How to Do Bulk Insertion From CSV File to SQL Server?

Bulk Insertion From a CSV File to SQL Server

Huge amounts of data have become an integral part of almost every organization. Companies today use big data techniques to better understand their customers: they analyze customer data to identify behavior and preferences. Many techniques are used to handle operations on big data, and they make it convenient to process very large datasets.

Today, we will discuss several techniques for inserting a huge amount of data (around 5 million rows) from a CSV file into a SQL Server database.

How to Use an Insert Operation for Bulk Data?

There are various ways to transfer and insert data from a CSV file into the database, and our goal is to execute the insertion from a C# application. Here, we are going to discuss the techniques listed below:

  • SQL BULK Insert Query
  • SqlBulkCopy library or BCP to insert bulk data by using VB or C# languages
  • Through SSIS or SQL Server Integration Services
  • Normal SQL command library in VB or C#
From here, developers can learn the guidelines for bulk data insertion in SQL Server.

1). SQL Bulk Insert Query

This is a query used to perform a bulk insert operation directly from a CSV file. The syntax of this direct query is given below:

Query Syntax:

BULK INSERT dbo.TableForBulkData
FROM 'C:\BulkDataFile.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)

Along with this query, one can also wrap the operation in a transaction with a TRY...CATCH block, as shown in the following example:

BEGIN TRANSACTION
BEGIN TRY
    BULK INSERT dbo.BulkDataTable
    FROM 'C:\TestFiles\Bulk2.csv'
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '\n',
        ROWS_PER_BATCH = 20000,
        TABLOCK
    )
    COMMIT TRANSACTION
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION
END CATCH

Here, BULK INSERT is a T-SQL statement that accepts options such as FIRE_TRIGGERS and BATCHSIZE, so developers can control whether triggers fire and can execute the load in batches of various sizes.
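For example, if the CSV file begins with a header row, BULK INSERT can skip it with the FIRSTROW option. A minimal sketch, reusing the placeholder file path and table from the example above:

```sql
BULK INSERT dbo.BulkDataTable
FROM 'C:\TestFiles\Bulk2.csv'
WITH (
    FIRSTROW = 2,            -- skip the header row
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    BATCHSIZE = 20000        -- commit every 20,000 rows
)
```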

2). SqlBulkCopy library or BCP to insert bulk data by using VB or C# languages

There is a SqlBulkCopy class in the System.Data.SqlClient namespace of .NET that can be used to copy a large amount of data from any .NET application to a SQL Server database efficiently. You can find a detailed explanation of this class in Microsoft's MSDN documentation.

The SqlBulkCopy class can copy a huge amount of data, for example from a data table to a database table. Its WriteToServer method accepts the data as a System.Data.DataRow[], a System.Data.DataTable, or a System.Data.IDataReader.

If you have a large CSV file, you may first have to convert it to a DataRow[], DataTable, or IDataReader. There are also some open-source libraries that can be used for such conversions. One of these converters is CsvReader, which exposes CSV file data through the System.Data.IDataReader interface.

We can use the following code to perform a SqlBulkCopy from a CSV file to the database:

StreamReader file = new StreamReader(bulk_data_filename);
CsvReader csv = new CsvReader(file, true, ',');
SqlBulkCopy copy = new SqlBulkCopy(conn);
copy.DestinationTableName = tablename;
copy.WriteToServer(csv);

Through this standard code, one can easily transfer bulk data from a CSV file to the database. By default, each copy operation runs in its own transaction, but explicit Commit and Rollback operations can also be included. In the next section, we give the SqlBulkCopy code with an explicit transaction.

SqlConnection con = new SqlConnection(connectionString);
con.Open();
SqlTransaction transact = con.BeginTransaction();
try
{
    using (StreamReader file = new StreamReader(file1))
    {
        CsvReader csv = new CsvReader(file, true, ',');
        SqlBulkCopy copy = new SqlBulkCopy(con, SqlBulkCopyOptions.KeepIdentity, transact);
        copy.DestinationTableName = tablename;
        copy.WriteToServer(csv);
        transact.Commit();
    }
}
catch (Exception)
{
    transact.Rollback();
}
finally
{
    con.Close();
}

Here, for large CSV files, developers can also set a bulk copy timeout (in seconds):

copy.BulkCopyTimeout = XXX;

3). Through SSIS or SQL Server Integration Services

A powerful data integration and transformation facility, known as SQL Server Integration Services (SSIS), is also shipped with SQL Server. Here, SSIS is used to move data from an external data source into SQL Server. BIDS, or Business Intelligence Development Studio, can be used to import data from a CSV file into a database, and the resulting package can also be scheduled as an automatically recurring job. The below-listed steps create such an SSIS package:

  • Open BIDS (Business Intelligence Development Studio) from SQL Server
  • Create a new 'Integration Services Project'
  • Now, click on the "Data Flow" tab
  • On the "Data Flow" page, drag a Flat File Source and an ADO.NET Destination from the toolbox and set them up
  • Connect the output path of the Flat File Source to the ADO.NET Destination

When you run this SSIS package, the data will be copied from the CSV file to the SQL Server database.

4). Normal SQL command library in VB or C#

Conventional SQL commands can also be used to insert data into the database, but their execution speed is quite slow: around 10 times slower than the three methods listed above. To perform a bulk insert, one should avoid looping over a CSV file row by row and executing a SqlCommand for every row.
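For illustration, the row-by-row approach to avoid looks roughly like this (a sketch only; the connection string, table, and column names are placeholders):

```csharp
using System;
using System.Data.SqlClient;
using System.IO;

class RowByRowInsert
{
    static void Main()
    {
        string connectionString = "...";             // placeholder connection string
        string bulkDataFilename = @"C:\BulkDataFile.csv";

        // Anti-pattern: one INSERT round-trip per CSV row. For millions of
        // rows this is roughly 10x slower than BULK INSERT or SqlBulkCopy.
        using (SqlConnection con = new SqlConnection(connectionString))
        {
            con.Open();
            foreach (string line in File.ReadLines(bulkDataFilename))
            {
                string[] fields = line.Split(',');
                using (SqlCommand cmd = new SqlCommand(
                    "INSERT INTO dbo.BulkDataTable (Col1, Col2) VALUES (@c1, @c2)", con))
                {
                    cmd.Parameters.AddWithValue("@c1", fields[0]);
                    cmd.Parameters.AddWithValue("@c2", fields[1]);
                    cmd.ExecuteNonQuery();           // network round-trip per row
                }
            }
        }
    }
}
```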

Recommended Option

After performing a comparative study of all these methods, it was found that BULK INSERT and SqlBulkCopy perform far better than the other two methods. With them, a 400 MB CSV file can be copied in just 4 minutes, while SSIS takes about double that, around 8 minutes, for the same operation.

So, on the basis of this analysis, it is quite clear that one should use the BULK INSERT statement or the SqlBulkCopy class to perform the data transfer or insertion. SSIS can also be used for bulk data transfer, but it is not reasonable to use plain SQL commands for this operation. The exact timings also depend on the system configuration. One can further optimize the performance of a bulk import by:

  • Disabling triggers
  • Ordering the data in the data file
  • Using minimal logging
  • Importing data in native format
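Several of these optimizations map directly onto BULK INSERT options. A sketch, assuming the target table has a clustered index on an Id column, the data file is pre-sorted by Id, and the database uses the SIMPLE or BULK_LOGGED recovery model so that minimal logging applies:

```sql
BULK INSERT dbo.BulkDataTable
FROM 'C:\TestFiles\Bulk2.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK,             -- table-level lock, a prerequisite for minimal logging
    ORDER (Id ASC)       -- data file is pre-sorted to match the clustered index
)
-- Triggers stay disabled by default: FIRE_TRIGGERS is simply omitted.
```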

Final Expert Opinion

As per database expert Adrian Hills, bulk loading can be made efficient by using ADO.NET and the SqlBulkCopy class. If the data file is in XML format, you can load it into a DataSet, either manually or through the DataSet.ReadXml method. Beyond that, SqlBulkCopy loads can be optimized with the below-listed two methods:

  • Loading into a heap table (a table without a clustered index)
  • Using the TableLock option for the bulk load, which takes a bulk update lock on the table
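In code, the table lock corresponds to the SqlBulkCopyOptions.TableLock flag on SqlBulkCopy. A minimal sketch, assuming a placeholder connection string, a placeholder destination table, and a DataTable named dataTable holding the rows to load:

```csharp
using System.Data;
using System.Data.SqlClient;

class TableLockLoad
{
    static void Load(string connectionString, DataTable dataTable)
    {
        using (SqlConnection con = new SqlConnection(connectionString))
        {
            con.Open();
            // TableLock takes a bulk update lock on the whole table, which
            // enables minimal logging and speeds up large loads into a heap.
            using (SqlBulkCopy copy = new SqlBulkCopy(
                con, SqlBulkCopyOptions.TableLock, null))
            {
                copy.DestinationTableName = "dbo.BulkDataTable"; // placeholder
                copy.BatchSize = 20000;
                copy.WriteToServer(dataTable);
            }
        }
    }
}
```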

With this discussion, you should now be clear on how to perform bulk insertion from a CSV file to SQL Server. Don't forget to try these techniques in practice for a better understanding.
