@push.rocks/smartai

A TypeScript library for integrating and interacting with multiple AI models, offering capabilities for chat and potentially audio responses.

readme.md for @push.rocks/smartai

A unified provider registry for the Vercel AI SDK 🧠⚡


SmartAI gives you a single getModel() function that returns a standard LanguageModelV3 for any supported provider: Anthropic, OpenAI, Google, Groq, Mistral, XAI, Perplexity, or Ollama. Use the returned model with the Vercel AI SDK's generateText(), streamText(), and tool ecosystem. Specialized capabilities like vision, audio, image generation, document analysis, and web research are available as dedicated subpath imports.

Issue Reporting and Security

For reporting bugs, issues, or security vulnerabilities, please visit community.foss.global/. This is the central community hub for all issue reporting. Developers who sign and comply with our contribution agreement and go through identification can also get a code.foss.global/ account to submit Pull Requests directly.

🎯 Why SmartAI?

📦 Installation

pnpm install @push.rocks/smartai

🚀 Quick Start

import { getModel, generateText, streamText } from '@push.rocks/smartai';

// Get a model for any provider
const model = getModel({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
  apiKey: process.env.ANTHROPIC_TOKEN,
});

// Use it with the standard AI SDK functions
const result = await generateText({
  model,
  prompt: 'Explain quantum computing in simple terms.',
});

console.log(result.text);

That's it. Change provider to 'openai' and model to 'gpt-4o' and the rest of your code stays exactly the same.
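
Because only the options object changes between providers, one way to keep the provider choice out of application code is a small lookup table. The following is a minimal sketch, assuming the model IDs and environment variable names used in this README; `PRESETS` and `presetFor` are hypothetical helper names and are not part of SmartAI.

```typescript
// Hypothetical helper: maps a provider name to the options object passed to getModel().
type PresetProvider = 'anthropic' | 'openai';

interface ModelPreset {
  provider: PresetProvider;
  model: string;
  apiKey: string | undefined;
}

// Model IDs and environment variable names are taken from the examples in this README.
const PRESETS: Record<PresetProvider, { model: string; envVar: string }> = {
  anthropic: { model: 'claude-sonnet-4-5-20250929', envVar: 'ANTHROPIC_TOKEN' },
  openai: { model: 'gpt-4o', envVar: 'OPENAI_TOKEN' },
};

// Looks up the preset and reads the matching API key from the given environment.
function presetFor(
  provider: PresetProvider,
  env: Record<string, string | undefined>,
): ModelPreset {
  const { model, envVar } = PRESETS[provider];
  return { provider, model, apiKey: env[envVar] };
}

// Usage sketch: const model = getModel(presetFor('openai', process.env));
```

Passing the environment in as an argument keeps the helper easy to test without touching real credentials.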

🔧 Core API

getModel(options): LanguageModelV3

The primary export. Returns a standard LanguageModelV3 you can use with any AI SDK function.

import { getModel } from '@push.rocks/smartai';
import type { ISmartAiOptions } from '@push.rocks/smartai';

const options: ISmartAiOptions = {
  provider: 'anthropic',  // 'anthropic' | 'openai' | 'google' | 'groq' | 'mistral' | 'xai' | 'perplexity' | 'ollama'
  model: 'claude-sonnet-4-5-20250929',
  apiKey: 'sk-ant-...',
  // Anthropic-only: prompt caching (default: true)
  promptCaching: true,
  // Ollama-only: base URL (default: http://localhost:11434)
  baseUrl: 'http://localhost:11434',
  // Ollama-only: model runtime options
  ollamaOptions: { think: true, num_ctx: 4096 },
};

const model = getModel(options);

Re-exported AI SDK Functions

SmartAI re-exports the most commonly used functions from ai for convenience:

import {
  getModel,
  generateText,
  streamText,
  tool,
  jsonSchema,
} from '@push.rocks/smartai';

import type {
  ModelMessage,
  ToolSet,
  StreamTextResult,
  LanguageModelV3,
} from '@push.rocks/smartai';

🤖 Supported Providers

| Provider | Package | Example Models |
| --- | --- | --- |
| Anthropic | @ai-sdk/anthropic | claude-sonnet-4-5-20250929, claude-opus-4-5-20250929 |
| OpenAI | @ai-sdk/openai | gpt-4o, gpt-4o-mini, o3-mini |
| Google | @ai-sdk/google | gemini-2.0-flash, gemini-2.5-pro |
| Groq | @ai-sdk/groq | llama-3.3-70b-versatile, mixtral-8x7b-32768 |
| Mistral | @ai-sdk/mistral | mistral-large-latest, mistral-small-latest |
| XAI | @ai-sdk/xai | grok-3, grok-3-mini |
| Perplexity | @ai-sdk/perplexity | sonar-pro, sonar |
| Ollama | Custom LanguageModelV3 | qwen3:8b, llama3:8b, deepseek-r1 |

💬 Text Generation

Generate Text

import { getModel, generateText } from '@push.rocks/smartai';

const model = getModel({
  provider: 'openai',
  model: 'gpt-4o',
  apiKey: process.env.OPENAI_TOKEN,
});

const result = await generateText({
  model,
  system: 'You are a helpful assistant.',
  prompt: 'What is 2 + 2?',
});

console.log(result.text); // e.g. "4" (exact wording depends on the model)

Stream Text

import { getModel, streamText } from '@push.rocks/smartai';

const model = getModel({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
  apiKey: process.env.ANTHROPIC_TOKEN,
});

const result = streamText({
  model,
  prompt: 'Count from 1 to 10.',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
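
The `for await` loop above works for any `AsyncIterable<string>`. The sketch below applies the same loop to accumulate chunks into one string, using a stand-in async generator so it runs without an API key; `fakeTextStream` and `collectText` are hypothetical names used only for illustration.

```typescript
// Stand-in for result.textStream: yields text chunks asynchronously.
async function* fakeTextStream(): AsyncGenerator<string> {
  for (const chunk of ['1, 2, ', '3, 4, ', '5']) {
    yield chunk;
  }
}

// Collects every chunk of a text stream into a single string.
async function collectText(stream: AsyncIterable<string>): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    full += chunk;
  }
  return full;
}

collectText(fakeTextStream()).then((text) => console.log(text)); // prints "1, 2, 3, 4, 5"
```

With a real model, `collectText(result.textStream)` would be the equivalent call.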

Tool Calling

import { getModel, generateText, tool, jsonSchema } from '@push.rocks/smartai';

const model = getModel({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
  apiKey: process.env.ANTHROPIC_TOKEN,
});

const result = await generateText({
  model,
  prompt: 'What is the weather in London?',
  tools: {
    getWeather: tool({
      description: 'Get weather for a location',
      inputSchema: jsonSchema({
        type: 'object',
        properties: {
          location: { type: 'string' },
        },
        required: ['location'],
      }),
      execute: async ({ location }) => {
        return { temperature: 18, condition: 'cloudy' };
      },
    }),
  },
});
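
A tool's `execute` handler is an ordinary async function, so it can be written and tested on its own before it is handed to `tool()`. A minimal sketch under that assumption: `WeatherInput`, `WeatherReport`, and `getWeatherHandler` are hypothetical names, and the returned values are placeholder data like those in the example above.

```typescript
interface WeatherInput {
  location: string;
}

interface WeatherReport {
  location: string;
  temperature: number;
  condition: string;
}

// Placeholder handler: a real implementation would query a weather service.
async function getWeatherHandler(input: WeatherInput): Promise<WeatherReport> {
  return { location: input.location, temperature: 18, condition: 'cloudy' };
}

// Exercise the handler directly; no model or API key is involved.
getWeatherHandler({ location: 'London' }).then((report) => {
  console.log(`${report.location}: ${report.temperature} degrees, ${report.condition}`);
}); // prints "London: 18 degrees, cloudy"
```

The handler can then be passed as `execute: getWeatherHandler` in the tool definition.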

🏠 Ollama (Local Models)

The custom Ollama provider implements LanguageModelV3 directly, calling Ollama's native /api/chat endpoint. This gives you features that generic OpenAI-compatible wrappers miss:

import { getModel, generateText } from '@push.rocks/smartai';

const model = getModel({
  provider: 'ollama',
  model: 'qwen3:8b',
  baseUrl: 'http://localhost:11434', // default
  ollamaOptions: {
    think: true,      // Enable thinking/reasoning mode
    num_ctx: 8192,     // Context window size
    temperature: 0.7,  // Override default (Qwen models auto-default to 0.55)
  },
});

const result = await generateText({
  model,
  prompt: 'Solve this step by step: what is 15% of 340?',
});

console.log(result.text);
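
To illustrate what a native /api/chat call involves, the sketch below builds a request body in the shape documented by Ollama, with `think` as a top-level field and runtime parameters under `options`. This README does not show how SmartAI's provider assembles its request internally, so `buildOllamaChatBody` and the types around it are hypothetical and should not be read as the library's implementation.

```typescript
interface OllamaChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface OllamaSketchOptions {
  think?: boolean;
  num_ctx?: number;
  temperature?: number;
}

interface OllamaChatBody {
  model: string;
  messages: OllamaChatMessage[];
  stream: boolean;
  think?: boolean;
  options: { num_ctx?: number; temperature?: number };
}

// Splits `think` (top-level request field) from the runtime options.
function buildOllamaChatBody(
  modelId: string,
  messages: OllamaChatMessage[],
  opts: OllamaSketchOptions = {},
): OllamaChatBody {
  const { think, ...runtimeOptions } = opts;
  const body: OllamaChatBody = {
    model: modelId,
    messages,
    stream: false,
    options: runtimeOptions,
  };
  if (think !== undefined) {
    body.think = think;
  }
  return body;
}

const ollamaBody = buildOllamaChatBody(
  'qwen3:8b',
  [{ role: 'user', content: 'What is 15% of 340?' }],
  { think: true, num_ctx: 8192, temperature: 0.7 },
);
// This JSON would be POSTed to `${baseUrl}/api/chat`.
console.log(JSON.stringify(ollamaBody, null, 2));
```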

Ollama Features

💰 Anthropic Prompt Caching

When using the Anthropic provider, SmartAI automatically wraps the model with caching middleware that adds cacheControl: { type: 'ephemeral' } to the last system message and last user message. This can significantly reduce cost and latency for repeated calls with the same system prompt.

// Caching enabled by default
const model = getModel({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
  apiKey: process.env.ANTHROPIC_TOKEN,
});

// Opt out of caching
const modelNoCaching = getModel({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
  apiKey: process.env.ANTHROPIC_TOKEN,
  promptCaching: false,
});

You can also use the middleware directly:

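import { createAnthropicCachingMiddleware } from '@push.rocks/smartai';
import { createAnthropic } from '@ai-sdk/anthropic';
import { wrapLanguageModel } from 'ai';

// Create an unwrapped Anthropic model with the upstream provider package, then wrap it manually
const anthropic = createAnthropic({ apiKey: process.env.ANTHROPIC_TOKEN });
const baseModel = anthropic('claude-sonnet-4-5-20250929');

const middleware = createAnthropicCachingMiddleware();
const cachedModel = wrapLanguageModel({ model: baseModel, middleware });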
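
The marking rule described at the start of this section (last system message plus last user message) can be sketched as a pure function over a message list. This is an illustration only, not the middleware's source: `SketchMessage` and `markCacheBreakpoints` are hypothetical, and placing the marker under `providerOptions.anthropic` is an assumption based on how the AI SDK passes Anthropic-specific options.

```typescript
type CacheControl = { type: 'ephemeral' };

interface SketchMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
  providerOptions?: { anthropic?: { cacheControl?: CacheControl } };
}

// Returns a copy of the list with the last system and last user message marked.
function markCacheBreakpoints(messages: SketchMessage[]): SketchMessage[] {
  let lastSystem = -1;
  let lastUser = -1;
  messages.forEach((message, index) => {
    if (message.role === 'system') lastSystem = index;
    if (message.role === 'user') lastUser = index;
  });

  return messages.map((message, index): SketchMessage => {
    if (index !== lastSystem && index !== lastUser) return message;
    return {
      ...message,
      providerOptions: {
        ...message.providerOptions,
        anthropic: {
          ...message.providerOptions?.anthropic,
          cacheControl: { type: 'ephemeral' },
        },
      },
    };
  });
}
```

Marked messages are copied rather than mutated, so the caller's original message objects stay unchanged.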

📦 Subpath Exports

SmartAI provides specialized capabilities as separate subpath imports. Each one is a focused utility that takes a model (or API key) and does one thing well.

👁️ Vision - @push.rocks/smartai/vision

Analyze images using any vision-capable model.

import { analyzeImage } from '@push.rocks/smartai/vision';
import { getModel } from '@push.rocks/smartai';
import * as fs from 'fs';

const model = getModel({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
  apiKey: process.env.ANTHROPIC_TOKEN,
});

const description = await analyzeImage({
  model,
  image: fs.readFileSync('photo.jpg'),
  prompt: 'Describe this image in detail.',
  mediaType: 'image/jpeg', // optional, defaults to 'image/jpeg'
});

console.log(description);

analyzeImage(options) accepts:

🎙️ Audio - @push.rocks/smartai/audio

Text-to-speech using OpenAI's TTS models.

import { textToSpeech } from '@push.rocks/smartai/audio';
import * as fs from 'fs';

const stream = await textToSpeech({
  apiKey: process.env.OPENAI_TOKEN,
  text: 'Welcome to the future of AI development!',
  voice: 'nova',     // 'alloy' | 'echo' | 'fable' | 'onyx' | 'nova' | 'shimmer'
  model: 'tts-1-hd', // 'tts-1' | 'tts-1-hd'
  responseFormat: 'mp3', // 'mp3' | 'opus' | 'aac' | 'flac'
  speed: 1.0,         // 0.25 to 4.0
});

stream.pipe(fs.createWriteStream('welcome.mp3'));

🎨 Image - @push.rocks/smartai/image

Generate and edit images using OpenAI's image models.

import { generateImage, editImage } from '@push.rocks/smartai/image';

// Generate an image
const result = await generateImage({
  apiKey: process.env.OPENAI_TOKEN,
  prompt: 'A futuristic cityscape at sunset, digital art',
  model: 'gpt-image-1',     // 'gpt-image-1' | 'dall-e-3' | 'dall-e-2'
  quality: 'high',           // 'low' | 'medium' | 'high' | 'auto'
  size: '1024x1024',
  background: 'transparent', // gpt-image-1 only
  outputFormat: 'png',       // 'png' | 'jpeg' | 'webp'
  n: 1,
});

// result.images[0].b64_json holds the base64-encoded image data
const imageBuffer = Buffer.from(result.images[0].b64_json!, 'base64');

// Edit an existing image
const edited = await editImage({
  apiKey: process.env.OPENAI_TOKEN,
  image: imageBuffer,
  prompt: 'Add a rainbow in the sky',
  model: 'gpt-image-1',
});

📄 Document - @push.rocks/smartai/document

Analyze PDF documents by converting them to images and using a vision model. Uses @push.rocks/smartpdf for PDF-to-PNG conversion (requires Chromium/Puppeteer).

import { analyzeDocuments, stopSmartpdf } from '@push.rocks/smartai/document';
import { getModel } from '@push.rocks/smartai';
import * as fs from 'fs';

const model = getModel({
  provider: 'anthropic',
  model: 'claude-sonnet-4-5-20250929',
  apiKey: process.env.ANTHROPIC_TOKEN,
});

const analysis = await analyzeDocuments({
  model,
  systemMessage: 'You are a legal document analyst.',
  userMessage: 'Summarize the key terms and conditions.',
  pdfDocuments: [fs.readFileSync('contract.pdf')],
  messageHistory: [],  // optional: prior conversation context
});

console.log(analysis);

// Clean up the SmartPdf instance when done
await stopSmartpdf();
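
Conceptually, once each PDF page has been rendered to PNG, the pages can be sent to a vision model as image parts alongside the user's question. The sketch below shows one plausible way to assemble such a multimodal message. The part shapes mirror the AI SDK's text and image content parts, but `SketchPart` and `buildDocumentPrompt` are hypothetical, and the ordering SmartAI actually uses is not stated in this README.

```typescript
type SketchPart =
  | { type: 'text'; text: string }
  | { type: 'image'; image: Uint8Array; mediaType: string };

// Puts the question first, followed by one PNG image part per rendered page.
function buildDocumentPrompt(
  userMessage: string,
  pageImages: Uint8Array[],
): SketchPart[] {
  const imageParts = pageImages.map(
    (image): SketchPart => ({ type: 'image', image, mediaType: 'image/png' }),
  );
  return [{ type: 'text', text: userMessage }, ...imageParts];
}
```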

🔬 Research - @push.rocks/smartai/research

Perform web-search-powered research using Anthropic's web_search_20250305 tool.

import { research } from '@push.rocks/smartai/research';

const result = await research({
  apiKey: process.env.ANTHROPIC_TOKEN,
  query: 'What are the latest developments in quantum computing?',
  searchDepth: 'basic',     // 'basic' | 'advanced' | 'deep'
  maxSources: 10,           // optional: limit number of search results
  allowedDomains: ['nature.com', 'arxiv.org'],  // optional: restrict to domains
  blockedDomains: ['reddit.com'],               // optional: exclude domains
});

console.log(result.answer);
console.log('Sources:', result.sources);       // Array<{ url, title, snippet }>
console.log('Queries:', result.searchQueries); // search queries the model used
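
Since `result.sources` is an array of `{ url, title, snippet }` objects, turning it into a citation list is a short mapping step. A minimal sketch, where `ResearchSource` and `formatSources` are hypothetical names and the sample entries are placeholder data:

```typescript
interface ResearchSource {
  url: string;
  title: string;
  snippet: string;
}

// Renders sources as a numbered list, one per line.
function formatSources(sources: ResearchSource[]): string {
  return sources
    .map((source, index) => `[${index + 1}] ${source.title} (${source.url})`)
    .join('\n');
}

// Placeholder data for illustration only.
const sampleSources: ResearchSource[] = [
  { url: 'https://example.com/a', title: 'First result', snippet: '...' },
  { url: 'https://example.com/b', title: 'Second result', snippet: '...' },
];

console.log(formatSources(sampleSources));
```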

🧪 Testing

# All tests
pnpm test

# Individual test files
tstest test/test.smartai.ts --verbose    # Core getModel + generateText + streamText
tstest test/test.ollama.ts --verbose     # Ollama provider (mocked, no API needed)
tstest test/test.vision.ts --verbose     # Vision analysis
tstest test/test.image.ts --verbose      # Image generation
tstest test/test.research.ts --verbose   # Web research
tstest test/test.audio.ts --verbose      # Text-to-speech
tstest test/test.document.ts --verbose   # Document analysis (needs Chromium)

Most tests skip gracefully when API keys are not set. The Ollama tests are fully mocked and require no external services.

Architecture

@push.rocks/smartai
├── ts/                          # Core package
│   ├── index.ts                 # Re-exports getModel, AI SDK functions, types
│   ├── smartai.classes.smartai.ts  # getModel(): provider switch
│   ├── smartai.interfaces.ts    # ISmartAiOptions, TProvider, IOllamaModelOptions
│   ├── smartai.provider.ollama.ts  # Custom LanguageModelV3 for Ollama
│   ├── smartai.middleware.anthropic.ts  # Prompt caching middleware
│   └── plugins.ts               # AI SDK provider factories
├── ts_vision/                   # @push.rocks/smartai/vision
├── ts_audio/                    # @push.rocks/smartai/audio
├── ts_image/                    # @push.rocks/smartai/image
├── ts_document/                 # @push.rocks/smartai/document
└── ts_research/                 # @push.rocks/smartai/research

The core package is a thin registry. getModel() creates the appropriate @ai-sdk/* provider, calls it with the model ID, and returns the resulting LanguageModelV3. For Anthropic, it optionally wraps the model with prompt caching middleware. For Ollama, it returns a custom LanguageModelV3 implementation that talks directly to Ollama's /api/chat endpoint.

Subpath modules are independent: they import ai and provider SDKs directly, not through the core package. This keeps the dependency graph clean and allows tree-shaking.
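
The registry idea described above can be pictured as a switch over the provider name. The sketch below uses stub return values in place of real @ai-sdk/* factories so it runs on its own. `getModelSketch`, `SketchOptions`, and `SketchModel` are hypothetical and cover only three providers; this is not SmartAI's actual getModel() source.

```typescript
type SketchProvider = 'anthropic' | 'openai' | 'ollama';

interface SketchOptions {
  provider: SketchProvider;
  model: string;
  promptCaching?: boolean;
  baseUrl?: string;
}

// Stub standing in for a LanguageModelV3 instance.
interface SketchModel {
  provider: SketchProvider;
  modelId: string;
  wrappedWithCaching: boolean;
  baseUrl?: string;
}

function getModelSketch(options: SketchOptions): SketchModel {
  switch (options.provider) {
    case 'anthropic':
      // Caching middleware is applied unless the caller opts out.
      return {
        provider: 'anthropic',
        modelId: options.model,
        wrappedWithCaching: options.promptCaching !== false,
      };
    case 'openai':
      return { provider: 'openai', modelId: options.model, wrappedWithCaching: false };
    case 'ollama':
      // Custom implementation talking to the local Ollama server.
      return {
        provider: 'ollama',
        modelId: options.model,
        wrappedWithCaching: false,
        baseUrl: options.baseUrl ?? 'http://localhost:11434',
      };
    default: {
      // Exhaustiveness check: fails to compile if a provider is left unhandled.
      const unsupported: never = options.provider;
      throw new Error(`Unsupported provider: ${String(unsupported)}`);
    }
  }
}
```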

This repository contains open-source code licensed under the MIT License. A copy of the license can be found in the LICENSE file.

Please note: The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.

Trademarks

This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH or third parties, and are not included within the scope of the MIT license granted herein.

Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines or the guidelines of the respective third-party owners, and any usage must be approved in writing. Third-party trademarks used herein are the property of their respective owners and used only in a descriptive manner, e.g. for an implementation of an API or similar.

Company Information

Task Venture Capital GmbH Registered at District Court Bremen HRB 35230 HB, Germany

By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.

changelog.md for @push.rocks/smartai

2026-03-05 - 2.0.0 - BREAKING CHANGE(vercel-ai-sdk)

migrate to Vercel AI SDK v6 and introduce provider registry (getModel) returning LanguageModelV3

2026-01-20 - 0.13.3 - fix()

no changes detected

2026-01-20 - 0.13.2 - fix(repo)

no changes detected in diff; nothing to commit

2026-01-20 - 0.13.1 - fix()

no changes detected; no release required

2026-01-20 - 0.13.0 - feat(provider.ollama)

add chain-of-thought reasoning support to chat messages and Ollama provider

2026-01-20 - 0.12.1 - fix(docs)

update documentation: clarify provider capabilities, add provider capabilities summary, polish examples and formatting, and remove Serena project config

2026-01-20 - 0.12.0 - feat(ollama)

add support for base64-encoded images in chat messages and forward them to the Ollama provider

2026-01-20 - 0.11.0 - feat(ollama)

support defaultOptions and defaultTimeout for ollama provider

2026-01-20 - 0.10.1 - fix()

no changes detected; no release necessary

2026-01-18 - 0.10.0 - feat(mistral)

add Mistral provider with native PDF OCR and chat integration

2026-01-18 - 0.9.0 - feat(providers)

Add Anthropic extended thinking and adapt providers to new streaming/file APIs; bump dependencies and update docs, tests and configuration

2025-10-30 - 0.8.0 - feat(provider.anthropic)

Add extended thinking modes to AnthropicProvider and apply thinking budgets to API calls

2025-10-10 - 0.7.7 - fix(MultiModalModel)

Lazy-load SmartPdf and guard document processing across providers; ensure SmartPdf is initialized only when needed

2025-10-09 - 0.7.6 - fix(provider.elevenlabs)

Provide default ElevenLabs TTS voice fallback and add local tool/project configs

2025-10-08 - 0.7.5 - fix(provider.elevenlabs)

Update ElevenLabs default TTS model to eleven_v3 and add local Claude permissions file

2025-10-03 - 0.7.4 - fix(provider.anthropic)

Use image/png for embedded PDF images in Anthropic provider and add local Claude settings for development permissions

2025-10-03 - 0.7.3 - fix(tests)

Add extensive provider/feature tests and local Claude CI permissions

2025-10-03 - 0.7.2 - fix(anthropic)

Update Anthropic provider branding to Claude Sonnet 4.5 and add local Claude permissions

2025-10-03 - 0.7.1 - fix(docs)

Add README image generation docs and .claude local settings

2025-10-03 - 0.7.0 - feat(providers)

Add research API and image generation/editing support; extend providers and tests

2025-09-28 - 0.6.1 - fix(provider.anthropic)

Fix Anthropic research tool identifier and add tests + local Claude permissions

2025-09-28 - 0.6.0 - feat(research)

Introduce research API with provider implementations, docs and tests

2025-08-12 - 0.5.11 - fix(openaiProvider)

Update default chat model to gpt-5-mini and bump dependency versions

2025-08-03 - 0.5.10 - fix(dependencies)

Update SmartPdf to v4.1.1 for enhanced PDF processing capabilities

2025-08-01 - 0.5.9 - fix(documentation)

Remove contribution section from readme

2025-08-01 - 0.5.8 - fix(core)

Fix SmartPdf lifecycle management and update dependencies

2025-07-26 - 0.5.7 - fix(provider.openai)

Fix stream type mismatch in audio method

2025-07-25 - 0.5.5 - feat(documentation)

Comprehensive documentation enhancement and test improvements

2025-05-13 - 0.5.4 - fix(provider.openai)

Update dependency versions, clean test imports, and adjust default OpenAI model configurations

2025-04-03 - 0.5.3 - fix(package.json)

Add explicit packageManager field to package.json

2025-04-03 - 0.5.2 - fix(readme)

Remove redundant conclusion section from README to streamline documentation.

2025-02-25 - 0.5.1 - fix(OpenAiProvider)

Corrected audio model ID in OpenAiProvider

2025-02-25 - 0.5.0 - feat(documentation and configuration)

Enhanced package and README documentation

2025-02-25 - 0.4.2 - fix(core)

Fix OpenAI chat streaming and PDF document processing logic.

2025-02-25 - 0.4.1 - fix(provider)

Fix provider modules for consistency

2025-02-08 - 0.4.0 - feat(core)

Added support for Exo AI provider

2025-02-05 - 0.3.3 - fix(documentation)

Update readme with detailed license and legal information.

2025-02-05 - 0.3.2 - fix(documentation)

Remove redundant badges from readme

2025-02-05 - 0.3.1 - fix(documentation)

Updated README structure and added detailed usage examples

2025-02-05 - 0.3.0 - feat(integration-xai)

Add support for X.AI provider with chat and document processing capabilities.

2025-02-03 - 0.2.0 - feat(provider.anthropic)

Add support for vision and document processing in Anthropic provider

2025-02-03 - 0.1.0 - feat(providers)

Add vision and document processing capabilities to providers

2025-02-03 - 0.0.19 - fix(core)

Enhanced chat streaming and error handling across providers

2024-09-19 - 0.0.18 - fix(dependencies)

Update dependencies to the latest versions.

2024-05-29 - 0.0.17 - Documentation

Updated project description.

2024-05-17 - 0.0.16 to 0.0.15 - Core

Fixes and updates.

2024-04-29 - 0.0.14 to 0.0.13 - Core

Fixes and updates.

2024-04-29 - 0.0.12 - Core

Fixes and updates.

2024-04-29 - 0.0.11 - Provider

Fix integration for anthropic provider.

2024-04-27 - 0.0.10 to 0.0.9 - Core

Fixes and updates.

2024-04-01 - 0.0.8 to 0.0.7 - Core and npmextra

Core updates and npmextra configuration.

2024-03-31 - 0.0.6 to 0.0.2 - Core

Initial core updates and fixes.

This summarizes the relevant updates and changes based on the provided commit messages. The changelog excludes commits that are version tags without meaningful content or repeated entries.