
Pandas Fast Database Insert

Python, Pandas, SQL

Found that `df.to_sql` is too slow for large inserts; use the snippet below to insert hundreds of thousands of rows quickly.
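For context, here is a minimal sketch of the `to_sql` baseline being replaced. The in-memory SQLite target and the two-row frame are stand-ins for illustration only; the real destination in this note is SQL Server over ODBC.

```python
import sqlite3
import pandas as pd

# Toy frame standing in for a large CSV load
df = pd.DataFrame({"ColumnA": ["1", "2"], "ColumnB": ["x", "y"]})

conn = sqlite3.connect(":memory:")
# chunksize batches the round trips, but to_sql still generates and executes
# INSERT statements row by row under the default method, which is what makes
# it slow at hundreds of thousands of rows.
df.to_sql("demo", conn, index=False, if_exists="replace", chunksize=1000)
rows = conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
print(rows)
```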

I generally use ceODBC, but pyodbc seems just as fast.

```python
import pandas as pd
import pyodbc

df = pd.read_csv(fileName)  # fileName: path to your CSV
df = df.applymap(str)       # coerce every value to str before binding
tuples = [tuple(x) for x in df.values]

def chunks(l, n):
    n = max(1, n)
    return [l[i:i + n] for i in range(0, len(l), n)]

# Break inserts into 1000 rows at a time; can be more or less depending on
# how many columns you have
new_list = chunks(tuples, 1000)

conn = pyodbc.connect(connectionString)  # connectionString: your ODBC string
curr = conn.cursor()
for chunk in new_list:
    curr.executemany(
        "INSERT INTO dbo.TABLE ([ColumnA],[ColumnB],[ColumnC],[ColumnD],[Discontinued]) VALUES (?,?,?,?,?)",
        chunk)
conn.commit()
curr.close()
```
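The same chunked `executemany` pattern can be run end-to-end against an in-memory SQLite database, so it works without an ODBC data source; the table name and columns here are made up for the demo, and in practice you would swap the `sqlite3` connection for ceODBC/pyodbc.

```python
import sqlite3

# 2500 fake rows, already coerced to str as in the snippet above
rows = [(str(i), str(i * 2)) for i in range(2500)]

def chunks(l, n):
    n = max(1, n)
    return [l[i:i + n] for i in range(0, len(l), n)]

conn = sqlite3.connect(":memory:")
curr = conn.cursor()
curr.execute("CREATE TABLE demo (a TEXT, b TEXT)")
# Three batches: 1000 + 1000 + 500
for chunk in chunks(rows, 1000):
    curr.executemany("INSERT INTO demo (a, b) VALUES (?, ?)", chunk)
conn.commit()
count = curr.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
print(count)  # 2500
curr.close()
```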
Commonly reused code