Receipt parser improvements: voided items, UPC, quantity defaults, and model selection

Major improvements to receipt parsing:

Voided Item Handling:
- Add Voided boolean field to ReceiptLineItem model and database
- Never skip any line items - include voided items with voided=true and lineTotal=0.00
- Strong parser hints: "CONTINUE reading", "do NOT stop parsing", "Read ENTIRE receipt"
- Ensures all items after void markers are captured
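
A minimal Python sketch of how a consumer of the parser output could enforce the voided-item rule after parsing. The `LineItem` dataclass and both helper functions are hypothetical illustrations, not the project's actual model code. Only the semantics (keep every item, `voided=true` implies a line total of 0.00) come from this commit.

```python
from dataclasses import dataclass, replace
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    # Hypothetical stand-in for ReceiptLineItem; field names are assumptions.
    description: str
    line_total: Decimal
    voided: bool = False


def normalize_voided(items: list[LineItem]) -> list[LineItem]:
    """Keep every item, but force voided items to a zero line total."""
    return [
        replace(item, line_total=Decimal("0.00")) if item.voided else item
        for item in items
    ]


def receipt_subtotal(items: list[LineItem]) -> Decimal:
    """Sum line totals; voided items contribute nothing."""
    return sum((i.line_total for i in normalize_voided(items)), Decimal("0.00"))
```

Normalizing after parsing means a voided item that the model returns with its original price still ends up with a zero total, and the item count stays unchanged.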

UPC/Barcode Extraction:
- Extract UPC codes (12-13 digits) and store in Sku field
- Enables price tracking over time even when descriptions change
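
A hedged Python sketch of validating an extracted code before storing it. The 12-13 digit length comes from this commit. The check-digit verification (the standard GS1 mod-10 scheme used by UPC-A and EAN-13) is an addition for illustration and is not something the commit itself does. The function names are hypothetical.

```python
import re
from typing import Optional

# A run of exactly 12 or 13 digits, not embedded in a longer digit run.
UPC_PATTERN = re.compile(r"(?<!\d)\d{12,13}(?!\d)")


def has_valid_check_digit(code: str) -> bool:
    """GS1 mod-10 check: weights 3,1,3,... counted from the right of the body."""
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


def extract_upc(text: str) -> Optional[str]:
    """Return the first 12-13 digit code with a valid check digit, else None."""
    for match in UPC_PATTERN.finditer(text):
        if has_valid_check_digit(match.group()):
            return match.group()
    return None
```

Rejecting codes with a bad check digit is one way to avoid storing misread digits in the Sku field, at the cost of dropping store-internal codes that do not follow the GS1 scheme.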

Quantity Defaults:
- ALWAYS default to 1.0 for ALL retail products (groceries, goods, merchandise)
- Only use null for utility bills, service fees, or taxes
- Emphatic instructions: "MUST be 1.0", "do NOT leave it null"
- Prevents missing quantities on retail items
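
The defaulting rule can also be applied defensively after parsing, in case the model still returns a null quantity. This is a sketch under stated assumptions: the `kind` classification and both function names are hypothetical, and the unit-price calculation follows the prompt's "lineTotal divided by quantity" wording.

```python
from typing import Optional

# Assumed labels for non-product lines; the real code may classify differently.
NON_PRODUCT_KINDS = {"utility", "service_fee", "tax"}


def default_quantity(kind: str, quantity: Optional[float]) -> Optional[float]:
    """Retail items default to 1.0; non-product lines always get None."""
    if kind in NON_PRODUCT_KINDS:
        return None
    return 1.0 if quantity is None else quantity


def unit_price(line_total: float, quantity: Optional[float]) -> Optional[float]:
    """lineTotal / quantity, or None when there is no usable quantity."""
    if quantity is None or quantity == 0:
        return None
    return round(line_total / quantity, 2)
```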

Model Selection:
- Add AI model dropdown in ViewReceipt UI (gpt-4o-mini vs gpt-4o)
- Update IReceiptParser interface to accept optional model parameter
- Pass selected model through to OpenAI API
- Store model name in parse logs for history tracking
- Allows using smarter model for complex receipts
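
A rough Python sketch of the optional-model flow described above. The project's `IReceiptParser` is not shown in this excerpt, so everything here is hypothetical: the class shape, the `send` callable standing in for the OpenAI call, and the choice of `gpt-4o-mini` as the default are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

DEFAULT_MODEL = "gpt-4o-mini"  # assumed default; not stated in the commit
ALLOWED_MODELS = ("gpt-4o-mini", "gpt-4o")


@dataclass
class ReceiptParser:
    # send(model, image_bytes) -> raw response text; a placeholder for the API call.
    send: Callable[[str, bytes], str]
    parse_log: list[dict] = field(default_factory=list)

    def parse(self, image: bytes, model: Optional[str] = None) -> str:
        """Use the caller's model if given, else the default, and log the choice."""
        chosen = model or DEFAULT_MODEL
        if chosen not in ALLOWED_MODELS:
            raise ValueError(f"unsupported model: {chosen}")
        result = self.send(chosen, image)
        self.parse_log.append({"model": chosen, "image_bytes": len(image)})
        return result
```

Keeping the parameter optional means existing callers that do not pass a model keep working unchanged.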

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
AJ
2025-10-19 16:08:56 -04:00
parent d0f4b420f8
commit f09d19ec5c
8 changed files with 698 additions and 31 deletions
+26 -7
@@ -10,19 +10,38 @@ Analyze this receipt image and extract the following information as JSON:
 "lineItems": [
 {
 "description": "item name",
-"quantity": 1.0 (or null),
+"upc": "1234567890123" (or null if not found),
+"quantity": 1.0,
 "unitPrice": 0.00 (or null),
-"lineTotal": 0.00
+"lineTotal": 0.00,
+"voided": false
 }
 ]
 }
 Extract all line items you can see on the receipt. For each item:
-- description: The item or service name
-- quantity: Only include if this is an actual countable product (like groceries). For services, fees, charges, or taxes, set to null.
-- unitPrice: Price per unit if quantity applies, otherwise null
-- lineTotal: The total amount for this line (required)
+- description: The item or service name (include any count/size info in the description itself, like "4CT" or "12 OZ")
+- upc: The UPC/barcode number if visible (usually a 12-13 digit number near the item). This helps track price changes over time. Set to null if not found.
+- quantity: ALWAYS set to 1.0 for ALL retail products (groceries, goods, merchandise, etc.) - this is the default. ONLY use null for utility bills, service fees, or taxes (non-product items). If it's a physical item on a retail receipt, use 1.0.
+- unitPrice: Calculate as lineTotal divided by quantity (so usually equals lineTotal for retail items). Set to null only if quantity is null.
+- lineTotal: The total amount for this line (the price shown on the receipt, or 0.00 if voided)
+- voided: Set to true if this item appears immediately after a "** VOIDED ENTRY **" marker or similar void indicator. Set to false for all other items.
-For utility bills, service charges, fees, and taxes - these are NOT products with quantities, so set quantity and unitPrice to null.
+CRITICAL - HANDLING VOIDED ITEMS:
+- NEVER skip or ignore ANY line items on the receipt
+- When you see "** VOIDED ENTRY **" or similar void markers, the item immediately after it is voided
+- For voided items: set "voided": true and "lineTotal": 0.00
+- For all other items: set "voided": false
+- CONTINUE reading and extracting ALL items that appear after void markers - do NOT stop parsing
+- The receipt may have many items listed after a void marker - you MUST include every single one
+- Include EVERY line item you can see, whether voided or not
+OTHER IMPORTANT RULES:
+- Quantity MUST be 1.0 for ALL physical retail items (groceries, food, household goods, etc.) - do NOT leave it null
+- Every item on a grocery/retail receipt gets quantity: 1.0 unless you see explicit indicators like "2 @" or "QTY 3"
+- Only utility bills, service charges, fees, or taxes (non-product line items) should have null quantity
+- Don't confuse product descriptions (like "4CT BLUE MUF" meaning 4-count muffin package) with quantity fields (like "2 @ $3.99")
+- Extract UPC/barcode numbers when visible - they're usually long numeric codes (12-13 digits)
+- Read through the ENTIRE receipt from top to bottom - don't stop early
 If this is a bill (utility, credit card, etc.), look for a due date, payment due date, or deadline and extract it as dueDate. For regular receipts, dueDate should be null.
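
The prompt's distinction between package-size text such as "4CT" and explicit quantity markers such as "2 @" or "QTY 3" can be illustrated with a small Python sketch. The patterns and the `explicit_quantity` helper are hypothetical and cover only the marker styles quoted in the prompt. Real receipts may use other notations.

```python
import re
from typing import Optional

# "2 @ $3.99": a standalone number directly followed by "@".
_AT_PATTERN = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)\s*@")
# "QTY 3": the literal word QTY followed by a number.
_QTY_PATTERN = re.compile(r"\bQTY\s*(\d+(?:\.\d+)?)\b", re.IGNORECASE)


def explicit_quantity(line: str) -> Optional[float]:
    """Return the quantity named by an explicit marker, or None if there is none."""
    for pattern in (_AT_PATTERN, _QTY_PATTERN):
        match = pattern.search(line)
        if match:
            return float(match.group(1))
    return None
```

A line like "4CT BLUE MUF" yields `None` here because the digit is part of the description rather than a marker, so the caller would fall back to the default quantity of 1.0.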