Feature: Database-aware receipt parsing with tool-use and enriched prompts

AIReceiptParser now routes each request to either a tool-aware or a
standard vision client. Tool-capable models (OpenAI, Claude, LlamaCpp)
call search_categories, search_transactions, and search_merchants during
parsing. Ollama instead gets pre-fetched DB context injected into the
prompt. Adds suggestedCategory and suggestedTransactionId fields for
AI-driven transaction mapping. Includes NullableLongConverter for
resilient JSON deserialization and a restructured receipt prompt with
strict field types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
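
The commit message names NullableLongConverter, but the converter itself is not part of the diff shown here. The following is a minimal Python sketch of the general idea, assuming the goal is to accept a model reply that sends suggestedTransactionId as a number, a numeric string, the string "null", or JSON null. The function name coerce_nullable_int is hypothetical and is not taken from the repository.

```python
from typing import Any, Optional


def coerce_nullable_int(value: Any) -> Optional[int]:
    """Coerce a loosely typed JSON value to an int, or None if absent or unusable.

    Hypothetical helper: illustrates tolerant deserialization of an ID field,
    not the actual NullableLongConverter from the commit.
    """
    if value is None or isinstance(value, bool):
        # JSON null, or a boolean (bool is an int subclass in Python): no ID.
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        # Accept 123.0 but reject 123.5, which cannot be an ID.
        return int(value) if value.is_integer() else None
    if isinstance(value, str):
        text = value.strip()
        if text == "" or text.lower() == "null":
            # Models sometimes emit the string "null" instead of JSON null.
            return None
        try:
            return int(text)
        except ValueError:
            return None
    # Arrays, objects, and anything else are treated as "no ID".
    return None
```

Returning None on bad input, rather than raising, keeps one malformed field from discarding an otherwise usable receipt. Whether the real converter makes the same trade-off is an assumption.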
2026-02-15 19:13:56 -05:00
parent 167b7c2ec1
commit 396d5cfc1d
2 changed files with 203 additions and 34 deletions
+44 -29
@@ -1,47 +1,62 @@
Analyze this receipt image and extract the following information as JSON:
Analyze this receipt image and extract structured data. Respond with a single JSON object matching this exact schema. Use JSON null (not the string "null") for missing values. Do not include comments in the JSON.
{
"merchant": "store name",
"receiptDate": "YYYY-MM-DD" (or null if not found),
"dueDate": "YYYY-MM-DD" (or null if not found - for bills only),
"subtotal": 0.00 (or null if not found),
"tax": 0.00 (or null if not found),
"receiptDate": "YYYY-MM-DD",
"dueDate": null,
"subtotal": 0.00,
"tax": 0.00,
"total": 0.00,
"confidence": 0.95,
"suggestedCategory": null,
"suggestedTransactionId": null,
"lineItems": [
{
"description": "item name",
"upc": "1234567890123" (or null if not found),
"upc": null,
"quantity": 1.0,
"unitPrice": 0.00 (or null),
"unitPrice": 0.00,
"lineTotal": 0.00,
"category": null,
"voided": false
}
]
}
Extract all line items you can see on the receipt. For each item:
- description: The item or service name (include any count/size info in the description itself, like "4CT" or "12 OZ")
- upc: The UPC/barcode number if visible (usually a 12-13 digit number near the item). This helps track price changes over time. Set to null if not found.
- quantity: ALWAYS set to 1.0 for ALL retail products (groceries, goods, merchandise, etc.) - this is the default. ONLY use null for utility bills, service fees, or taxes (non-product items). If it's a physical item on a retail receipt, use 1.0.
- unitPrice: Calculate as lineTotal divided by quantity (so usually equals lineTotal for retail items). Set to null only if quantity is null.
- lineTotal: The total amount for this line (the price shown on the receipt, or 0.00 if voided)
- voided: Set to true if this item appears immediately after a "** VOIDED ENTRY **" marker or similar void indicator. Set to false for all other items.
FIELD TYPES (you must follow these exactly):
- merchant: string
- receiptDate: string "YYYY-MM-DD" or null
- dueDate: string "YYYY-MM-DD" or null (only for bills with a payment deadline)
- subtotal: number or null
- tax: number or null
- total: number
- confidence: number between 0 and 1
- suggestedCategory: string or null
- suggestedTransactionId: integer or null (MUST be a JSON number like 123, NEVER a string like "123")
- lineItems: array of objects
CRITICAL - HANDLING VOIDED ITEMS:
- NEVER skip or ignore ANY line items on the receipt
- When you see "** VOIDED ENTRY **" or similar void markers, the item immediately after it is voided
LINE ITEM FIELDS:
- description: string (the item or service name, include count/size info like "4CT" or "12 OZ")
- upc: string or null (UPC/barcode number if visible, usually 12-13 digits)
- quantity: number (default 1.0 for all retail products; null only for service fees or taxes)
- unitPrice: number or null (lineTotal divided by quantity; null only if quantity is null)
- lineTotal: number (the price shown on the receipt; 0.00 if voided)
- category: string or null
- voided: boolean
RULES FOR LINE ITEMS:
- Extract ALL line items from top to bottom - never stop early
- quantity is 1.0 for ALL physical retail items unless you see "2 @" or "QTY 3" etc.
- Do not confuse product descriptions (like "4CT BLUE MUF" = 4-count muffin package) with quantity
- UPC/barcode numbers are long numeric codes (12-13 digits) near the item
VOIDED ITEMS:
- When you see "** VOIDED ENTRY **" or similar, the item immediately after it is voided
- For voided items: set "voided": true and "lineTotal": 0.00
- For all other items: set "voided": false
- CONTINUE reading and extracting ALL items that appear after void markers - do NOT stop parsing
- The receipt may have many items listed after a void marker - you MUST include every single one
- Include EVERY line item you can see, whether voided or not
- NEVER skip voided items - include them in the lineItems array
- CONTINUE reading ALL items after void markers
OTHER IMPORTANT RULES:
- Quantity MUST be 1.0 for ALL physical retail items (groceries, food, household goods, etc.) - do NOT leave it null
- Every item on a grocery/retail receipt gets quantity: 1.0 unless you see explicit indicators like "2 @" or "QTY 3"
- Only utility bills, service charges, fees, or taxes (non-product line items) should have null quantity
- Don't confuse product descriptions (like "4CT BLUE MUF" meaning 4-count muffin package) with quantity fields (like "2 @ $3.99")
- Extract UPC/barcode numbers when visible - they're usually long numeric codes (12-13 digits)
- Read through the ENTIRE receipt from top to bottom - don't stop early
If this is a bill (utility, credit card, etc.), look for a due date, payment due date, or deadline and extract it as dueDate. For regular receipts, dueDate should be null.
DUE DATE:
- Only for bills (utility, credit card, etc.) - extract the payment due date
- For regular store receipts, dueDate must be null
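
The strict field types in the new prompt can also be checked on the receiving side. Below is a minimal Python sketch of such a check, assuming the model reply arrives as a JSON string. The function validate_receipt and its error messages are hypothetical, and the sketch covers only a subset of the rules listed in the prompt.

```python
import json
from typing import Any, List


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_receipt(raw: str) -> List[str]:
    """Return a list of schema violations; an empty list means the reply conforms."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return ["response must be a single JSON object"]
    errors: List[str] = []

    if not isinstance(data.get("merchant"), str):
        errors.append("merchant must be a string")
    if not _is_number(data.get("total")):
        errors.append("total must be a number")
    for key in ("subtotal", "tax"):
        if data.get(key) is not None and not _is_number(data[key]):
            errors.append(f"{key} must be a number or null")
    confidence = data.get("confidence")
    if not _is_number(confidence) or not 0 <= confidence <= 1:
        errors.append("confidence must be a number between 0 and 1")
    txn_id = data.get("suggestedTransactionId")
    if txn_id is not None and (isinstance(txn_id, bool) or not isinstance(txn_id, int)):
        errors.append("suggestedTransactionId must be an integer or null")

    items = data.get("lineItems")
    if not isinstance(items, list):
        errors.append("lineItems must be an array")
        return errors
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            errors.append(f"lineItems[{i}] must be an object")
            continue
        if not isinstance(item.get("voided"), bool):
            errors.append(f"lineItems[{i}].voided must be a boolean")
        # Voided items must carry a zero line total.
        if item.get("voided") is True and item.get("lineTotal") != 0:
            errors.append(f"lineItems[{i}] is voided, so lineTotal must be 0.00")
        # unitPrice may be null only when quantity is also null.
        if item.get("unitPrice") is None and item.get("quantity") is not None:
            errors.append(f"lineItems[{i}].unitPrice may be null only if quantity is null")
    return errors
```

Passing a reply whose suggestedTransactionId is the string "123" yields a single violation for that field. A caller could log the violation, retry the request, or fall back to tolerant coercion.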