# Factory Methods
Valkeyrie provides powerful factory methods to create and populate databases from existing data sources. This is perfect for data migrations, imports, seeding databases, or creating databases from API responses.
## Table of Contents
- Overview
- Valkeyrie.from() - Synchronous Data
- Valkeyrie.fromAsync() - Asynchronous Data
- Options Reference
- Key Extraction
- Progress Tracking
- Error Handling
- Performance
- With Schema Validation
- Real-World Examples
- Best Practices
## Overview
Instead of opening an empty database and inserting items one by one, factory methods let you create and populate a database in one operation:
```typescript
// ❌ The hard way
const db = await Valkeyrie.open('./users.db');
for (const user of users) {
  await db.set(['users', user.id], user);
}

// ✅ The easy way
const db = await Valkeyrie.from(users, {
  prefix: ['users'],
  keyProperty: 'id',
  path: './users.db'
});
```

**Benefits:**
- Much faster (batched atomic operations)
- Less code
- Progress tracking built-in
- Flexible error handling
- Works with any iterable source
## Valkeyrie.from() - Synchronous Data
Use `from()` for synchronous iterables: arrays, Sets, Maps, or custom iterables.
### Basic Usage
```typescript
import { Valkeyrie } from 'valkeyrie';

const users = [
  { id: 1, name: 'Alice', email: 'alice@example.com' },
  { id: 2, name: 'Bob', email: 'bob@example.com' },
  { id: 3, name: 'Charlie', email: 'charlie@example.com' }
];

const db = await Valkeyrie.from(users, {
  prefix: ['users'],
  keyProperty: 'id'
});

// Database is now populated!
const alice = await db.get(['users', 1]);
console.log(alice.value); // { id: 1, name: 'Alice', email: 'alice@example.com' }
```

### From Arrays
The most common use case:
```typescript
const products = [
  { sku: 'LAPTOP-1', name: 'Laptop', price: 999 },
  { sku: 'MOUSE-1', name: 'Mouse', price: 29 },
  { sku: 'KEYBOARD-1', name: 'Keyboard', price: 79 }
];

const db = await Valkeyrie.from(products, {
  prefix: ['products'],
  keyProperty: 'sku',
  path: './products.db'
});
```

### From Sets
```typescript
const uniqueEmails = new Set([
  'alice@example.com',
  'bob@example.com',
  'charlie@example.com'
]);

const db = await Valkeyrie.from(uniqueEmails, {
  prefix: ['emails'],
  keyProperty: (email) => email, // Use the email itself as the key
  path: './emails.db'
});

// Access by email
const exists = await db.get(['emails', 'alice@example.com']);
```

### From Maps
```typescript
const configMap = new Map([
  ['database_url', 'postgresql://...'],
  ['api_key', 'secret-key'],
  ['max_connections', 100]
]);

const db = await Valkeyrie.from(configMap.entries(), {
  prefix: ['config'],
  keyProperty: ([key]) => key, // Extract key from [key, value] tuple
  path: './config.db'
});
```

### From Custom Iterables
```typescript
class UserRepository {
  *getAll() {
    // Custom iteration logic
    for (let i = 1; i <= 100; i++) {
      yield {
        id: i,
        name: `User ${i}`,
        email: `user${i}@example.com`
      };
    }
  }
}

const repo = new UserRepository();
const db = await Valkeyrie.from(repo.getAll(), {
  prefix: ['users'],
  keyProperty: 'id'
});
```

## Valkeyrie.fromAsync() - Asynchronous Data
Use `fromAsync()` for async iterables: async generators, streams, or async iterators.
### From Async Generators
Perfect for paginated APIs:
```typescript
async function* fetchAllUsers() {
  let page = 1;
  let hasMore = true;

  while (hasMore) {
    const response = await fetch(`/api/users?page=${page}`);
    const data = await response.json();

    for (const user of data.users) {
      yield user;
    }

    hasMore = data.hasMore;
    page++;
  }
}

const db = await Valkeyrie.fromAsync(fetchAllUsers(), {
  prefix: ['users'],
  keyProperty: 'id',
  path: './users.db',
  onProgress: (processed) => {
    console.log(`Imported ${processed} users...`);
  }
});
```

### From Streams
```typescript
import { createReadStream } from 'node:fs';
import { parse } from 'csv-parse';
import { Readable } from 'node:stream';

// Parse CSV file
const stream = createReadStream('./users.csv')
  .pipe(parse({
    columns: true,
    skip_empty_lines: true
  }));

const db = await Valkeyrie.fromAsync(Readable.from(stream), {
  prefix: ['users'],
  keyProperty: 'id',
  path: './users.db'
});
```

### From Database Cursors
```typescript
// 'db' here is your existing database client (e.g., a MongoDB/Mongoose connection)
async function* fetchFromDatabase() {
  const cursor = db.collection('users').find().cursor();
  for await (const doc of cursor) {
    yield doc;
  }
}

const valkeyrie = await Valkeyrie.fromAsync(fetchFromDatabase(), {
  prefix: ['users'],
  keyProperty: '_id',
  path: './migrated-users.db'
});
```

### Processing Large Datasets
```typescript
async function* processLargeDataset() {
  const batchSize = 100;
  let offset = 0;

  while (true) {
    // fetchDataBatch and processItem are your own data-access helpers
    const batch = await fetchDataBatch(offset, batchSize);
    if (batch.length === 0) break;

    for (const item of batch) {
      // Transform data before inserting
      const processed = {
        id: item.id,
        data: await processItem(item),
        processedAt: Date.now()
      };
      yield processed;
    }

    offset += batchSize;
  }
}

const db = await Valkeyrie.fromAsync(processLargeDataset(), {
  prefix: ['processed'],
  keyProperty: 'id',
  expireIn: 86400000, // 24 hours TTL
  onProgress: (processed) => {
    if (processed % 1000 === 0) {
      console.log(`Processed ${processed} items`);
    }
  }
});
```

## Options Reference
Both `from()` and `fromAsync()` accept the same options:

| Option | Type | Required | Description |
|---|---|---|---|
| `prefix` | `Key` | Yes | Key prefix for all entries (e.g., `['users']`) |
| `keyProperty` | `keyof T \| (item: T) => KeyPart` | Yes | Property name or function to extract the key part |
| `path` | `string` | No | Database file path. If omitted, creates an in-memory database |
| `serializer` | `() => Serializer` | No | Custom serializer (default: v8 serializer) |
| `destroyOnClose` | `boolean` | No | Destroy the database file on close (default: `false`) |
| `expireIn` | `number` | No | TTL for all entries, in milliseconds |
| `onProgress` | `(processed: number, total?: number) => void` | No | Progress callback. `total` is only provided for sync iterables with known size |
| `onError` | `'stop' \| 'continue'` | No | Error handling strategy (default: `'stop'`) |
| `onErrorCallback` | `(error: Error, item: T) => void` | No | Called for each error when `onError: 'continue'` |
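For orientation, here is a sketch that combines several of these options in one call, reusing the `products` array from the earlier example; the callback bodies are illustrative:

```typescript
// Illustrative combination of the options above — not an exhaustive signature
const db = await Valkeyrie.from(products, {
  prefix: ['products'],   // required: key prefix for every entry
  keyProperty: 'sku',     // required: property name or extractor function
  path: './products.db',  // omit to create an in-memory database
  expireIn: 3600000,      // all imported entries expire after 1 hour
  onProgress: (processed, total) => {
    console.log(`Imported ${processed}/${total ?? '?'} products`);
  },
  onError: 'continue',    // skip failing items instead of aborting
  onErrorCallback: (error, item) => {
    console.error('Skipped item:', item, error);
  }
});
```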
## Key Extraction
The `keyProperty` option determines how keys are extracted from items.
### Using Property Names
```typescript
const users = [
  { id: 1, name: 'Alice' },
  { id: 2, name: 'Bob' }
];

// Use 'id' property as key
const db = await Valkeyrie.from(users, {
  prefix: ['users'],
  keyProperty: 'id' // Property name
});

// Results in keys: ['users', 1], ['users', 2]
```

### Using Functions
For more control, use a function:
```typescript
const users = [
  { firstName: 'Alice', lastName: 'Smith' },
  { firstName: 'Bob', lastName: 'Jones' }
];

// Use combined name as key
const db = await Valkeyrie.from(users, {
  prefix: ['users'],
  keyProperty: (user) => `${user.firstName}-${user.lastName}`
});

// Results in keys: ['users', 'Alice-Smith'], ['users', 'Bob-Jones']
```

### Transforming Keys
```typescript
const users = [
  { email: 'Alice@EXAMPLE.COM' },
  { email: 'Bob@EXAMPLE.COM' }
];

// Normalize emails
const db = await Valkeyrie.from(users, {
  prefix: ['users'],
  keyProperty: (user) => user.email.toLowerCase()
});

// Results in keys: ['users', 'alice@example.com'], ['users', 'bob@example.com']
```

### Valid Key Parts
The extracted key part must be one of:
- `string`
- `number`
- `bigint`
- `boolean`
- `Uint8Array`
```typescript
// ❌ Invalid - returns an object
keyProperty: (item) => ({ id: item.id })

// ✅ Valid - returns a string
keyProperty: (item) => String(item.id)

// ✅ Valid - returns a number
keyProperty: 'id'
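
// ✅ Valid - returns a Uint8Array (an added sketch: binary keys by encoding the id as bytes)
keyProperty: (item) => new TextEncoder().encode(String(item.id))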

// ✅ Valid - returns a bigint
keyProperty: (item) => BigInt(item.id)
```

## Progress Tracking
### Synchronous Iterables
For arrays and other sync iterables with known size, `total` is provided:
```typescript
const users = new Array(10000).fill(null).map((_, i) => ({
  id: i,
  name: `User ${i}`
}));

const db = await Valkeyrie.from(users, {
  prefix: ['users'],
  keyProperty: 'id',
  onProgress: (processed, total) => {
    const percent = ((processed / total!) * 100).toFixed(1);
    console.log(`Progress: ${processed}/${total} (${percent}%)`);
  }
});
```

### Asynchronous Iterables
For async iterables, `total` is not known:
```typescript
const db = await Valkeyrie.fromAsync(fetchAllUsers(), {
  prefix: ['users'],
  keyProperty: 'id',
  onProgress: (processed) => {
    // No total available
    console.log(`Processed ${processed} items...`);

    // Update every 100 items
    if (processed % 100 === 0) {
      console.log(`Milestone: ${processed} items processed`);
    }
  }
});
```

### Progress Bar Integration
```typescript
import ProgressBar from 'progress';

const users = [...]; // Array with known size

const bar = new ProgressBar('Importing [:bar] :current/:total :percent', {
  total: users.length
});

const db = await Valkeyrie.from(users, {
  prefix: ['users'],
  keyProperty: 'id',
  onProgress: (processed, total) => {
    bar.tick();
  }
});
```

## Error Handling
### Stop on Error (Default)
By default, the import stops immediately when an error occurs:
```typescript
const users = [
  { id: 1, name: 'Alice' },
  { id: 2 }, // Missing name - might cause validation error
  { id: 3, name: 'Charlie' }
];

try {
  const db = await Valkeyrie.from(users, {
    prefix: ['users'],
    keyProperty: 'id'
  });
} catch (error) {
  console.error('Import failed:', error);
  // Database is automatically closed on error
}
```

### Continue on Error
Skip invalid items and continue:
```typescript
const errors: Array<{ item: any; error: Error }> = [];

const db = await Valkeyrie.from(users, {
  prefix: ['users'],
  keyProperty: 'id',
  onError: 'continue',
  onErrorCallback: (error, item) => {
    console.error('Failed to import item:', item, error);
    errors.push({ item, error });
  }
});

console.log(`Imported successfully, ${errors.length} items failed`);
```

### Validation Errors with Schemas
```typescript
import { z } from 'zod';
import { ValidationError } from 'valkeyrie';

const userSchema = z.object({
  id: z.number(),
  email: z.string().email()
});

const users = [
  { id: 1, email: 'valid@example.com' },
  { id: 2, email: 'invalid-email' }, // Invalid!
  { id: 3, email: 'another@example.com' }
];

const invalidItems: any[] = [];

const db = await Valkeyrie
  .withSchema(['users', '*'], userSchema)
  .from(users, {
    prefix: ['users'],
    keyProperty: 'id',
    onError: 'continue',
    onErrorCallback: (error, item) => {
      if (error instanceof ValidationError) {
        console.log(`Validation failed for item ${item.id}:`, error.issues);
        invalidItems.push(item);
      }
    }
  });

console.log(`Imported ${users.length - invalidItems.length} valid items`);
```

## Performance
Factory methods are optimized for performance:
### Automatic Batching
Items are automatically batched into groups of 1000 and inserted using atomic operations:
```typescript
// Under the hood, this:
const db = await Valkeyrie.from(largeArray, {
  prefix: ['items'],
  keyProperty: 'id'
});

// Does this:
// Batch 1: items 0-999 (atomic operation)
// Batch 2: items 1000-1999 (atomic operation)
// Batch 3: items 2000-2999 (atomic operation)
// ... and so on
```

This provides excellent performance while maintaining atomicity within each batch.
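As a mental model, the batching behaves roughly like the manual loop below. This is a sketch, not the actual implementation, and it assumes the `atomic()` builder covered in the Advanced Patterns guide:

```typescript
// Sketch of the batching strategy (assumes db.atomic() from Advanced Patterns)
const BATCH_SIZE = 1000;
for (let i = 0; i < items.length; i += BATCH_SIZE) {
  const atomic = db.atomic();
  for (const item of items.slice(i, i + BATCH_SIZE)) {
    atomic.set(['items', item.id], item); // queue the write
  }
  await atomic.commit(); // each batch commits atomically
}
```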
### Memory Efficiency
For async iterables, items are processed as they arrive:
```typescript
// Memory-efficient: processes items one at a time
async function* hugeDataset() {
  for (let i = 0; i < 1000000; i++) {
    yield { id: i, data: generateData() };
  }
}

const db = await Valkeyrie.fromAsync(hugeDataset(), {
  prefix: ['items'],
  keyProperty: 'id'
});
```

### Performance Tips
**Use async iterables for large datasets**

```typescript
// ❌ Loads everything into memory
const allData = await fetchAllData();
const db = await Valkeyrie.from(allData, options);

// ✅ Streams data
const db = await Valkeyrie.fromAsync(streamData(), options);
```

**Batch external API calls**

```typescript
async function* fetchInBatches() {
  const batchSize = 100;
  for (let page = 0; page < totalPages; page++) {
    const response = await fetch(`/api/items?page=${page}&size=${batchSize}`);
    const batch = await response.json();
    for (const item of batch) {
      yield item;
    }
  }
}
```

**Choose the right serializer**

```typescript
// Faster for simple data
const db = await Valkeyrie.from(data, {
  prefix: ['items'],
  keyProperty: 'id',
  serializer: () => jsonSerializer()
});
```
## With Schema Validation
Factory methods work seamlessly with schema validation:
```typescript
import { z } from 'zod';

const userSchema = z.object({
  id: z.number(),
  name: z.string().min(1),
  email: z.string().email(),
  age: z.number().int().min(0).optional()
});

const users = [
  { id: 1, name: 'Alice', email: 'alice@example.com', age: 30 },
  { id: 2, name: 'Bob', email: 'bob@example.com' },
  { id: 3, name: 'Charlie', email: 'charlie@example.com', age: 25 }
];

// All items are validated during import
const db = await Valkeyrie
  .withSchema(['users', '*'], userSchema)
  .from(users, {
    prefix: ['users'],
    keyProperty: 'id',
    path: './users.db'
  });

// Type inference works!
const user = await db.get(['users', 1]);
// user.value is typed as the schema output type
```

## Real-World Examples
### Migrating from MongoDB
```typescript
import { MongoClient } from 'mongodb';

async function migrateFromMongoDB() {
  const mongo = await MongoClient.connect('mongodb://localhost:27017');
  const collection = mongo.db('myapp').collection('users');

  async function* streamUsers() {
    const cursor = collection.find();
    for await (const doc of cursor) {
      yield doc;
    }
  }

  const db = await Valkeyrie.fromAsync(streamUsers(), {
    prefix: ['users'],
    keyProperty: (user) => user._id.toString(),
    path: './migrated-db.db',
    onProgress: (processed) => {
      console.log(`Migrated ${processed} users`);
    }
  });

  await mongo.close();
  return db;
}
```

### Importing CSV Files
```typescript
import { createReadStream } from 'node:fs';
import { parse } from 'csv-parse';
import { Readable } from 'node:stream';

async function importCSV(filePath: string) {
  const stream = createReadStream(filePath)
    .pipe(parse({
      columns: true,
      skip_empty_lines: true,
      cast: true // Automatically convert types
    }));

  return await Valkeyrie.fromAsync(Readable.from(stream), {
    prefix: ['records'],
    keyProperty: 'id',
    path: './imported.db',
    onProgress: (processed) => {
      if (processed % 1000 === 0) {
        console.log(`Imported ${processed} records`);
      }
    }
  });
}
```

### Seeding Test Data
```typescript
import { faker } from '@faker-js/faker';

function* generateTestUsers(count: number) {
  for (let i = 0; i < count; i++) {
    yield {
      id: i + 1,
      name: faker.person.fullName(),
      email: faker.internet.email(),
      createdAt: faker.date.past()
    };
  }
}

// Create a test database with 10,000 users
const testDb = await Valkeyrie.from(generateTestUsers(10000), {
  prefix: ['users'],
  keyProperty: 'id',
  path: './test-data.db',
  destroyOnClose: true // Clean up after tests
});
```

### Caching API Responses
```typescript
async function* fetchAndCacheProducts() {
  const response = await fetch('https://api.example.com/products');
  const products = await response.json();
  for (const product of products) {
    yield product;
  }
}

const cache = await Valkeyrie.fromAsync(fetchAndCacheProducts(), {
  prefix: ['products'],
  keyProperty: 'id',
  expireIn: 3600000, // 1 hour cache
  onProgress: (processed) => {
    console.log(`Cached ${processed} products`);
  }
});
```

### Building Search Indexes
```typescript
const posts = [
  { id: 1, title: 'Hello World', content: 'This is my first post' },
  { id: 2, title: 'Another Post', content: 'More content here' }
];

// Index by ID
const byId = await Valkeyrie.from(posts, {
  prefix: ['posts', 'by-id'],
  keyProperty: 'id'
});

// Index by title (normalized)
const byTitle = await Valkeyrie.from(posts, {
  prefix: ['posts', 'by-title'],
  keyProperty: (post) => post.title.toLowerCase().replace(/\s+/g, '-')
});

// Now you can search by both
const post1 = await byId.get(['posts', 'by-id', 1]);
const post2 = await byTitle.get(['posts', 'by-title', 'hello-world']);
```

## Best Practices
### 1. Choose the Right Method
```typescript
// ✅ Use from() for arrays and known-size data
const db = await Valkeyrie.from(arrayData, options);

// ✅ Use fromAsync() for streams and large datasets
const db = await Valkeyrie.fromAsync(streamData(), options);
```

### 2. Handle Errors Appropriately
```typescript
// For production: log and continue
const db = await Valkeyrie.from(data, {
  prefix: ['items'],
  keyProperty: 'id',
  onError: 'continue',
  onErrorCallback: (error, item) => {
    logger.error('Import failed for item', { item, error });
  }
});

// For development: stop and debug
const db = await Valkeyrie.from(data, {
  prefix: ['items'],
  keyProperty: 'id',
  onError: 'stop' // Default
});
```

### 3. Use TTL for Temporary Data
```typescript
const cache = await Valkeyrie.from(data, {
  prefix: ['cache'],
  keyProperty: 'id',
  expireIn: 3600000 // 1 hour
});
```

### 4. Monitor Progress for Long Operations
```typescript
const db = await Valkeyrie.from(largeDataset, {
  prefix: ['items'],
  keyProperty: 'id',
  onProgress: (processed, total) => {
    if (processed % 5000 === 0) {
      console.log(`Progress: ${processed}/${total}`);
    }
  }
});
```

### 5. Clean Up Test Databases
```typescript
const testDb = await Valkeyrie.from(testData, {
  prefix: ['test'],
  keyProperty: 'id',
  path: './test.db',
  destroyOnClose: true // Automatic cleanup
});

// Use in tests
await using db = await Valkeyrie.from(testData, options);
// Automatically closed and destroyed
```

### 6. Validate Data with Schemas
```typescript
// Define what valid data looks like
const db = await Valkeyrie
  .withSchema(['users', '*'], userSchema)
  .from(externalData, {
    prefix: ['users'],
    keyProperty: 'id',
    onError: 'continue' // Skip invalid items
  });
```

## Summary
- ✅ How to create databases from arrays, Sets, Maps, and custom iterables
- ✅ How to stream large datasets efficiently with `fromAsync()`
- ✅ Key extraction strategies and options
- ✅ Progress tracking and error handling
- ✅ Performance optimization techniques
- ✅ Integration with schema validation
- ✅ Real-world migration and import patterns
Next steps:
- Advanced Patterns - Atomic operations and watch API
- Serializers - Choose the right serializer for your use case
