Data Migration & Bulk Ops
Every ERPNext implementation involves getting data into the system — whether it is master data during initial setup, historical transactions from a legacy system, or ongoing bulk operations. This chapter covers every approach from the built-in Data Import Tool to custom migration scripts and writing framework patches. If you’ve ever written a one-off Node.js script to seed a Postgres database, this is the Frappe equivalent — but with several layers of safety (and several ways to bypass them).
Data Import Tool (Built-In)
Section titled “Data Import Tool (Built-In)”ERPNext’s Data Import Tool provides a web-based interface for importing records from CSV or Excel files. It’s the right starting point for non-developers and for one-time master-data loads.
Importing from CSV/Excel
Section titled “Importing from CSV/Excel”Navigate to Home > Settings > Data Import or search for “Data Import” in the Awesome Bar:
- Click “New Data Import”.
- Select the DocType (e.g., “Item”, “Customer”, “Sales Invoice”).
- Select Import Type: “Insert New Records” or “Update Existing Records”.
- Download the template (CSV or Excel).
- Fill in the data.
- Upload the completed file.
- Click “Start Import”.
Download Template and Fill
Section titled “Download Template and Fill”The template includes all fields for the DocType, with mandatory fields highlighted:
Item Code,Item Name,Item Group,Stock UOM,Valuation RateVANILLA-100ML,Vanilla Scoop 100ml,Ice Cream,Nos,30CHOCO-100ML,Chocolate Scoop 100ml,Ice Cream,Nos,32MANGO-100ML,Mango Scoop 100ml,Ice Cream,Nos,35WAFFLE-CONE,Waffle Cone,Packaging,Nos,5Handling Link Fields
Section titled “Handling Link Fields”Link fields (foreign keys) require the exact name value of the linked record.
If the linked record does not exist, the import fails for that row:
Customer Name,Customer Type,Customer Group,TerritoryOutlet North Walk-In,Individual,Retail,North ZoneMetro Mall Food Court,Company,B2B,South ZoneImporting Child Tables
Section titled “Importing Child Tables”For DocTypes with child tables (like Sales Invoice with items), the template uses one row per child row. Parent fields are repeated or left blank after the first row:
Customer,Posting Date,Item Code (Items),Qty (Items),Rate (Items)Walk-In Customer,2025-06-01,VANILLA-100ML,2,60,,CHOCO-100ML,1,65,,WAFFLE-CONE,2,15Walk-In Customer,2025-06-01,MANGO-100ML,3,70The blank Customer/Posting Date cells indicate continuation of the previous parent row.
Update Existing Records
Section titled “Update Existing Records”Set Import Type to “Update Existing Records” and include the name (ID) column:
ID,Valuation Rate,DescriptionVANILLA-100ML,35,Premium vanilla bean ice creamCHOCO-100ML,38,Rich Belgian chocolate ice creamError Handling and Logs
Section titled “Error Handling and Logs”The Data Import tool provides:
- Row-by-row status: Success or failure per row.
- Error messages: Specific validation errors (missing mandatory fields, invalid Link values, duplicate names).
- Partial import: Successfully imported rows are committed even if other rows fail.
- Import log: Accessible via the Data Import record’s timeline.
API-Based Bulk Import
Section titled “API-Based Bulk Import”For programmatic imports, especially from legacy systems, use Frappe’s Python API. This is where a Node.js developer feels at home: it’s just code, looping over records and creating documents.
Using frappe.get_doc().insert()
Section titled “Using frappe.get_doc().insert()”The document API runs every validation, hook, and naming rule — exactly as if a user had saved the record in the Desk. That safety has a cost, so commit in batches and roll back individual failures rather than aborting the whole run.
import frappeimport csv
def import_items_from_csv(file_path): """Import items from a CSV file using the document API.""" success_count = 0 error_rows = []
with open(file_path, "r") as f: reader = csv.DictReader(f) for i, row in enumerate(reader, start=2): # start=2 accounts for header try: item = frappe.get_doc({ "doctype": "Item", "item_code": row["item_code"], "item_name": row["item_name"], "item_group": row["item_group"], "stock_uom": row.get("uom", "Nos"), "valuation_rate": float(row.get("valuation_rate", 0)), "is_stock_item": 1, "has_batch_no": 1 if row.get("has_batch") == "Yes" else 0, "has_expiry_date": 1 if row.get("has_expiry") == "Yes" else 0, }) item.insert(ignore_permissions=True) success_count += 1
# Commit every 100 records to avoid holding long transactions if success_count % 100 == 0: frappe.db.commit()
except Exception as e: error_rows.append({"row": i, "item_code": row.get("item_code"), "error": str(e)}) frappe.db.rollback() # rollback this single failed record frappe.clear_messages()
frappe.db.commit() return {"success": success_count, "errors": error_rows}Renaming Documents in Bulk
Section titled “Renaming Documents in Bulk”frappe.rename_doc updates the primary key and every link that references it,
so it’s the only safe way to rename a document — never UPDATE the name column
directly.
import frappe
def standardize_item_codes(prefix="SJ"): """Rename items to follow a standard naming convention.""" items = frappe.get_all("Item", filters={"item_code": ["not like", f"{prefix}-%"]}, fields=["name", "item_code", "item_name"] )
for item in items: new_name = f"{prefix}-{item.item_code}" try: frappe.rename_doc("Item", item.name, new_name, merge=False) except Exception as e: frappe.log_error(f"Failed to rename {item.name}: {e}")
frappe.db.commit()Bulk Operations with frappe.enqueue()
Section titled “Bulk Operations with frappe.enqueue()”For large datasets, run imports as background jobs to avoid request timeouts — the same reason you’d push heavy work onto a queue worker in Node.js instead of blocking the HTTP handler.
import frappe
@frappe.whitelist()def enqueue_bulk_import(file_url): """Enqueue a bulk import job for background processing.""" frappe.enqueue( "scoopjoy.utils.migration.run_legacy_import", queue="long", timeout=3600, file_url=file_url, job_name="legacy_pos_import" ) return {"message": "Import job queued. Check Background Jobs for status."}
def run_legacy_import(file_url): """Background job: import historical POS invoices from legacy system.""" import json
file_doc = frappe.get_doc("File", {"file_url": file_url}) file_path = file_doc.get_full_path()
with open(file_path, "r") as f: records = json.load(f)
total = len(records) for i, record in enumerate(records): try: create_pos_invoice_from_legacy(record)
if (i + 1) % 100 == 0: frappe.db.commit() frappe.publish_progress( percent=(i + 1) / total * 100, title="Legacy POS Import", description=f"Processed {i + 1} of {total}" )
except Exception as e: frappe.log_error( message=f"Row {i}: {str(e)}\nData: {json.dumps(record)}", title="Legacy POS Import Error" )
frappe.db.commit() frappe.publish_realtime("legacy_import_complete", {"total": total})
def create_pos_invoice_from_legacy(record): """Convert a legacy POS record into an ERPNext POS Invoice.""" si = frappe.get_doc({ "doctype": "POS Invoice", "company": record["company"], "customer": record.get("customer", "Walk-In Customer"), "posting_date": record["date"], "posting_time": record.get("time", "12:00:00"), "is_pos": 1, "pos_profile": record.get("pos_profile"), "items": [{ "item_code": item["code"], "qty": item["qty"], "rate": item["rate"], "warehouse": record.get("warehouse") } for item in record["items"]], "payments": [{ "mode_of_payment": record.get("payment_mode", "Cash"), "amount": record["total"] }] }) si.insert(ignore_permissions=True)
# Only submit if explicitly requested (historical data may need review) if record.get("auto_submit"): si.submit()frappe.publish_progress and frappe.publish_realtime push updates over
Socket.IO, so the user can watch a long import advance in the browser instead of
staring at a spinner.
Database Direct Operations
Section titled “Database Direct Operations”For maximum speed on very large datasets, bypass the document API and use direct database operations.
frappe.db.sql() for Bulk Updates
Section titled “frappe.db.sql() for Bulk Updates”import frappe
def bulk_update_item_prices(item_group, percentage_increase): """Increase valuation rate for all items in a group.""" frappe.db.sql(""" UPDATE `tabItem` SET valuation_rate = valuation_rate * (1 + %(pct)s / 100), modified = NOW(), modified_by = %(user)s WHERE item_group = %(group)s """, { "pct": percentage_increase, "group": item_group, "user": frappe.session.user }) frappe.db.commit()frappe.db.bulk_insert() for Mass Inserts
Section titled “frappe.db.bulk_insert() for Mass Inserts”import frappefrom frappe.utils import now, getdate
def bulk_insert_batch_records(items_with_batches): """Insert batch records using bulk_insert for maximum speed.""" fields = [ "name", "batch_id", "item", "expiry_date", "manufacturing_date", "batch_qty", "owner", "creation", "modified", "modified_by", "docstatus" ]
values = [] timestamp = now() user = frappe.session.user
for item in items_with_batches: batch_name = f"BATCH-{item['item_code']}-{item['batch_id']}" values.append([ batch_name, # name (must set manually) item["batch_id"], # batch_id item["item_code"], # item item.get("expiry_date"), # expiry_date item.get("mfg_date"), # manufacturing_date item.get("qty", 0), # batch_qty user, # owner timestamp, # creation timestamp, # modified user, # modified_by 0 # docstatus ])
# bulk_insert handles chunking internally (default chunk_size=10_000) frappe.db.bulk_insert("Batch", fields=fields, values=values, ignore_duplicates=True) frappe.db.commit()Key notes on bulk_insert:
- Does not trigger autoname — you must set
doc.nameexplicitly. - Does not set system fields — manually provide
creation,modified,owner,modified_by. - Does not run hooks — no
before_insert,after_insert, etc. - Chunks automatically — default
chunk_size=10_000prevents memory issues. ignore_duplicates=Trueskips rows with duplicate primary keys.
Migration from Legacy Systems
Section titled “Migration from Legacy Systems”Bringing a live business onto ERPNext means seeding opening balances so the books and stock ledgers start from the right numbers.
Opening Stock Entry
Section titled “Opening Stock Entry”After importing your Item master, enter opening stock for all warehouses:
def create_opening_stock_entry(company, warehouse, items_with_qty): """Create a stock entry for opening balances.""" se = frappe.get_doc({ "doctype": "Stock Entry", "stock_entry_type": "Material Receipt", "company": company, "posting_date": "2025-04-01", # financial year start "items": [{ "item_code": item["item_code"], "qty": item["qty"], "basic_rate": item["valuation_rate"], "t_warehouse": warehouse, "batch_no": item.get("batch_no"), "cost_center": frappe.db.get_value("Company", company, "cost_center") } for item in items_with_qty] }) se.insert(ignore_permissions=True) se.submit() return se.nameOpening Account Balances
Section titled “Opening Account Balances”Opening entries must balance, so any net difference is parked against the Temporary Opening account until the rest of the books are reconciled:
def create_opening_journal_entry(company, opening_balances): """Create opening balance journal entries for accounts.""" accounts = [] net_amount = 0
for bal in opening_balances: accounts.append({ "account": bal["account"], "debit_in_account_currency": bal.get("debit", 0), "credit_in_account_currency": bal.get("credit", 0), "party_type": bal.get("party_type"), "party": bal.get("party"), "cost_center": frappe.db.get_value("Company", company, "cost_center"), "is_opening": "Yes" }) net_amount += bal.get("debit", 0) - bal.get("credit", 0)
# Add Temporary Opening account to balance if net_amount != 0: temp_account = frappe.db.get_value("Account", {"account_name": "Temporary Opening", "company": company}, "name") accounts.append({ "account": temp_account, "debit_in_account_currency": max(-net_amount, 0), "credit_in_account_currency": max(net_amount, 0), "is_opening": "Yes" })
jv = frappe.get_doc({ "doctype": "Journal Entry", "voucher_type": "Opening Entry", "company": company, "posting_date": "2025-04-01", "is_opening": "Yes", "accounts": accounts }) jv.insert(ignore_permissions=True) jv.submit() return jv.nameCustomer/Supplier Master Migration
Section titled “Customer/Supplier Master Migration”Legacy data is messy, so guard against duplicates with frappe.db.exists and map
free-text categories onto real ERPNext masters as you go:
def migrate_customers_from_legacy(legacy_records): """Migrate customer data from legacy system to ERPNext.""" for record in legacy_records: # Check if customer already exists if frappe.db.exists("Customer", {"customer_name": record["name"]}): continue
customer = frappe.get_doc({ "doctype": "Customer", "customer_name": record["name"], "customer_type": record.get("type", "Individual"), "customer_group": map_customer_group(record.get("category")), "territory": map_territory(record.get("city")), "mobile_no": record.get("phone"), "email_id": record.get("email"), "customer_primary_address": None # create separately }) customer.insert(ignore_permissions=True)
# Create address if provided if record.get("address"): address = frappe.get_doc({ "doctype": "Address", "address_title": record["name"], "address_line1": record["address"].get("line1", ""), "city": record["address"].get("city", ""), "state": record["address"].get("state", ""), "pincode": record["address"].get("pincode", ""), "country": "India", "links": [{ "link_doctype": "Customer", "link_name": customer.name }] }) address.insert(ignore_permissions=True)
frappe.db.commit()bench migrate: Running Patches After App Updates
Section titled “bench migrate: Running Patches After App Updates”When you update an app or install a new version, bench migrate runs two critical
processes:
- Schema sync: Updates database tables to match DocType JSON definitions.
- Patch execution: Runs pending patches listed in
patches.txt.
# Standard migration after app updatebench --site mysite.local migrate
# Migration with verbose outputbench --site mysite.local migrate --verbose
# Migrate all sites after a bench-wide updatebench --site all migrate
# Rebuild assets after migration (if frontend changes)bench buildWriting Migration Patches
Section titled “Writing Migration Patches”Patches are one-time scripts that run during bench migrate. They are tracked in
tabPatch Log and never re-executed — the closest Frappe analogue to a one-shot
SQL migration file, except the framework owns the “has this run?” bookkeeping.
patches.txt Structure
Section titled “patches.txt Structure”The two sections matter: pre_model_sync patches run before DocType schemas
are pushed to the database (while old columns still exist), and post_model_sync
patches run after (when new columns are in place).
[pre_model_sync]# Patches here run BEFORE DocType schemas are synced to database# Use when: renaming fields, moving data before a field is removedscoopjoy.patches.v1_1.rename_outlet_fieldscoopjoy.patches.v1_2.migrate_legacy_franchise_data
[post_model_sync]# Patches here run AFTER DocType schemas are synced# Use when: populating new fields, computing default valuesscoopjoy.patches.v1_1.populate_franchise_codescoopjoy.patches.v1_2.compute_royalty_defaultsscoopjoy.patches.v1_3.cleanup_duplicate_outlets
# One-off statements (simple enough to not need a file)execute:frappe.db.set_single_value("Franchise Settings", "royalty_percentage", 5)Patch File Structure
Section titled “Patch File Structure”A patch is just a module exposing an execute() function. This one runs
post_model_sync, so the new franchise_code column already exists when it
back-fills values:
import frappe
def execute(): """Populate franchise_code for existing Franchise Outlet records.
This patch runs after the franchise_code field is added to the DocType (post_model_sync), so the column already exists in the database. """ outlets = frappe.get_all("Franchise Outlet", filters={"franchise_code": ["in", ["", None]]}, fields=["name", "outlet_name", "city"] )
for outlet in outlets: code = generate_franchise_code(outlet.outlet_name, outlet.city) frappe.db.set_value("Franchise Outlet", outlet.name, "franchise_code", code, update_modified=False)
frappe.db.commit()
def generate_franchise_code(name, city): """Generate a unique franchise code from outlet name and city.""" prefix = "".join(word[0] for word in name.split()[:3]).upper() city_code = city[:3].upper() if city else "XXX" return f"FR-{prefix}-{city_code}"Pre-Model-Sync Patch Example
Section titled “Pre-Model-Sync Patch Example”Renaming a field is the classic reason to use pre_model_sync: the DocType JSON
already declares the new name, so you must migrate data off the old column
before schema sync drops it. rename_field moves the data and updates
dependent metadata for you.
import frappefrom frappe.model.utils.rename_field import rename_field
def execute(): """Rename 'outlet_manager' to 'franchise_manager' before schema sync.
The DocType JSON already has the new field name. This patch runs pre_model_sync to migrate data from the old column before it is dropped. """ doctype = "Franchise Outlet"
# Check if old column still exists if frappe.db.has_column(doctype, "outlet_manager"): rename_field(doctype, "outlet_manager", "franchise_manager")
frappe.db.commit()Data Cleanup Script
Section titled “Data Cleanup Script”import frappe
def execute(): """Fix inconsistent outlet naming across the system.
Some outlets were created with trailing spaces or inconsistent capitalization. This patch standardizes all names. """ outlets = frappe.get_all("Franchise Outlet", fields=["name", "outlet_name"] )
for outlet in outlets: clean_name = " ".join(outlet.outlet_name.strip().split()) clean_name = clean_name.title()
if clean_name != outlet.outlet_name: frappe.db.set_value("Franchise Outlet", outlet.name, "outlet_name", clean_name, update_modified=False)
frappe.db.commit()