*Updated 06/28/2017* – Fixed the code sample. It was simply wrong before.
I have often written code to parse CSV files. Excel supports editing CSV files so this seemed to be a good method of managing moderate-sized data sets. I have not bothered to parse the Excel file directly because that seemed to be a rather large task with minimal additional value over my existing CSV parser. A quick search on google reveals there are several libraries which can be used. I choose EPPLus because it seemed to be the best.
Rapid Prototyping
Very quickly, I realized that parsing an excel file with a tool like EPPLus is dead simple. Here is the code to loop through all the cells in the first worksheet.
var package = new ExcelPackage(new FileInfo("..\\..\\..\\sample.xlsx")); ExcelWorksheet workSheet = package.Workbook.Worksheets[1]; for (int i = workSheet.Dimension.Start.Row; i <= workSheet.Dimension.End.Row; i++) { for (int j = workSheet.Dimension.Start.Column; j <= workSheet.Dimension.End.Column; j++) { object cellValue = workSheet.Cells[i, j].Value; } }
Next Steps
Well that wasn’t nearly as interesting as I had hoped. I certainly won’t shy away from parsing excel files in the future. Rather than just end here, I took some time to build a tool which reads in an Excel file and outputs an XML file. This included setting up a project to be highly modular with loosely coupled components, which could be easily swapped out. The resulting components are:
- ExcelReader – Logic for reading from an excel file
- TableReader – Logic for reading a generic set of tabular data. This could be later used for data coming from a CSV or a SQL table.
- Tables – Table information specific to this solution.
- XmlWriter – Logic for writing tables to an Xml File.
ExcelReader is nicely isolated and can be easily swapped out for another data source.
Final Thoughts
EPPLus is a great library. It makes parsing Excel files simple.
There is another C# Library for reading Excel files known as Aspose.Cells for .NET. Try it, you can find many code samples on their documentation page.
Why would you buy something when you can get it for free?
indeed
Fair enough, but mainly because it’s way faster EPPlus or NPOI. So depending on who you are and what you are using it for it may be interesting.
I meant that is way faster than EPPlus or NPOI.
I had to use package.Workbook.Worksheets[1]; rather than package.Workbook.Worksheets[0];
And thanks for the code!
That’s because EPPlus counter-intuitively uses base-1 indexing for everything. Bad design choice for an otherwise useful library.
Thank you, it works. The epplus documentation is not detailed. So you have to look in the intellisense what you have to do to get the length and iterate the cells.
seems like this is out-dated now. I’ve seen it on dozens of blogs and forums and I am certain I’ve used this exact format in the past but now it always throws an error.
I updated the code sample. You’re right. The code was wrong. It’s fixed now.
When dealing with single worksheet documents and you don’t want POCO theres https://github.com/mganss/ExcelMapper
If you have multiple worksheets, https://github.com/ipvalverde/EPPlus.DataExtractor
Nice to not have to read manually =)
Actually https://github.com/mganss/ExcelMapper can deal with multiple sheets aswell, i just didn’t know 🙂
A detailed blog with sample code: http://wp.me/p3Osmq-A4
A detailed blog: http://sforsuresh.in/reading-excel-file-using-epplus-package/
Hello !
I tried http://www.zetexcel.com !! It gives you the opportunity to generate, modify, convert, render and print spreadsheets without using Microsoft Excel
Pingback: How do I iterate through rows in an excel table using epplus? - PhotoLens
Pingback: How do I iterate through rows in an excel table using epplus? – Row Coding