C# – Get Last non empty column and row index from excel using Interop

c# etl excel ssis vb.net

I am trying to remove all extra blank rows and columns from an Excel file using the Interop library.

I followed the question "Fastest method to remove Empty rows and Columns From Excel Files using Interop" and found it helpful.

But I have Excel files that contain a small set of data and a lot of empty rows and columns (from the last non-empty row or column to the end of the worksheet).

I tried looping over rows and columns, but the loop takes hours.

I am trying to get the last non-empty row and column index so I can delete the whole empty range in one line:

XlWks.Range("...").EntireRow.Delete(xlShiftUp)


Note: I am trying to get the last row containing data so that I can remove all extra blanks (after this row or column).

Any suggestions?


Note: The code must be compatible with SSIS Script Task environment

Best Answer

Update 1

If your goal is to import the Excel data using C#, and assuming you have identified the highest used index in your worksheet (in the image you posted it is Col = 10, Row = 16), you can convert the maximum used indexes to letters, giving J16, and select only the used range using an OleDbCommand:

SELECT * FROM [Sheet1$A1:J16]

Otherwise, I don't think it is easy to find a faster method.
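The index-to-letter conversion described above can be sketched as follows. This is only an illustration, not code from the linked articles: the class name `ExcelRangeHelper` and the methods `ColumnIndexToLetter` and `BuildSelect` are hypothetical names, and the sheet name is assumed to need no escaping.

```csharp
using System;

// Hypothetical helper, not part of Interop or OLEDB.
public static class ExcelRangeHelper
{
    // Converts a 1-based column index to its Excel letter name (1 -> "A", 27 -> "AA").
    public static string ColumnIndexToLetter(int columnIndex)
    {
        if (columnIndex < 1)
            throw new ArgumentOutOfRangeException("columnIndex");

        string letters = string.Empty;
        while (columnIndex > 0)
        {
            // Column names are a bijective base-26 system, hence the "- 1".
            int remainder = (columnIndex - 1) % 26;
            letters = (char)('A' + remainder) + letters;
            columnIndex = (columnIndex - 1) / 26;
        }
        return letters;
    }

    // Builds a query restricted to A1 .. (lastColumn, lastRow) on the given sheet.
    public static string BuildSelect(string sheetName, int lastColumn, int lastRow)
    {
        return "SELECT * FROM [" + sheetName + "$A1:"
            + ColumnIndexToLetter(lastColumn) + lastRow + "]";
    }

    public static void Main()
    {
        // Col = 10, Row = 16 as in the example above.
        Console.WriteLine(BuildSelect("Sheet1", 10, 16)); // SELECT * FROM [Sheet1$A1:J16]
    }
}
```

The resulting string can then be used as the command text of the OleDbCommand.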

You can refer to these articles to convert indexes into letters and to connect to Excel using OLEDB:


Initial Answer

As you said, you started from the question "Fastest method to remove Empty rows and Columns From Excel Files using Interop", and you are trying to "get the last row containing data to remove all extra blanks (after this row , or column)".

So, assuming you are working with the accepted answer (provided by @JohnG), you can add a few lines of code to get the last used row and column.

Empty rows are stored in a list of integers, rowsToDelete.

You can use the following code to get the non-empty rows with an index smaller than the last empty row:

List<int> NonEmptyRows = Enumerable.Range(1, rowsToDelete.Max()).ToList().Except(rowsToDelete).ToList();

If NonEmptyRows.Max() < rowsToDelete.Max(), the last non-empty row is NonEmptyRows.Max(). Otherwise it is worksheet.Rows.Count and there are no empty rows after the last used one.

The same thing can be done to get the last non-empty column.
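To see the idea without Excel, here is a minimal sketch that works on a plain list of empty indexes. The class `LastUsedIndex` and the method `LastNonEmpty` are hypothetical names, and the `DefaultIfEmpty(0)` guard for a fully empty sheet is my addition, not something from the accepted answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper used only to illustrate the Except-based lookup.
public static class LastUsedIndex
{
    // Returns the highest index in 1..emptyIndexes.Max() that is NOT in emptyIndexes,
    // or 0 when every index in that span is empty.
    public static int LastNonEmpty(List<int> emptyIndexes)
    {
        if (emptyIndexes == null || emptyIndexes.Count == 0)
            throw new ArgumentException("emptyIndexes must contain at least one index.");

        return Enumerable.Range(1, emptyIndexes.Max())
                         .Except(emptyIndexes)
                         .DefaultIfEmpty(0)
                         .Max();
    }

    public static void Main()
    {
        // Rows 17..20 are trailing blanks; rows 4 and 9 are blanks inside the data.
        var rowsToDelete = new List<int> { 20, 19, 18, 17, 9, 4 };
        Console.WriteLine(LastNonEmpty(rowsToDelete)); // 16
    }
}
```

The same call works for columns by passing the list of empty column indexes instead.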

The code is edited in the DeleteCols and DeleteRows functions:

    private static void DeleteRows(List<int> rowsToDelete, Microsoft.Office.Interop.Excel.Worksheet worksheet)
    {
        // The rows are sorted high to low, so indexes won't shift while deleting.
        if (rowsToDelete.Count == 0) return; // nothing to delete

        int lastEmptyRow = rowsToDelete.Max();

        // Highest row index that is not in rowsToDelete (0 if every row up to lastEmptyRow is empty).
        int lastNonEmptyRow = Enumerable.Range(1, lastEmptyRow).Except(rowsToDelete).DefaultIfEmpty(0).Max();

        if (lastNonEmptyRow < lastEmptyRow)
        {
            // There are empty rows after the last non-empty row.
            Microsoft.Office.Interop.Excel.Range cell1 = worksheet.Cells[lastNonEmptyRow + 1, 1];
            Microsoft.Office.Interop.Excel.Range cell2 = worksheet.Cells[lastEmptyRow, 1];

            // Delete all empty rows after the last used row in a single call.
            worksheet.Range[cell1, cell2].EntireRow.Delete(Microsoft.Office.Interop.Excel.XlDeleteShiftDirection.xlShiftUp);
        }    // else the last non-empty row is worksheet.Rows.Count

        // Delete the remaining empty rows that sit between used rows.
        foreach (int rowIndex in rowsToDelete.Where(x => x < lastNonEmptyRow))
        {
            worksheet.Rows[rowIndex].Delete();
        }
    }

    private static void DeleteCols(List<int> colsToDelete, Microsoft.Office.Interop.Excel.Worksheet worksheet)
    {
        // The columns are sorted high to low, so indexes won't shift while deleting.
        if (colsToDelete.Count == 0) return; // nothing to delete

        int lastEmptyCol = colsToDelete.Max();

        // Highest column index that is not in colsToDelete (0 if every column up to lastEmptyCol is empty).
        int lastNonEmptyCol = Enumerable.Range(1, lastEmptyCol).Except(colsToDelete).DefaultIfEmpty(0).Max();

        if (lastNonEmptyCol < lastEmptyCol)
        {
            // There are empty columns after the last non-empty column.
            Microsoft.Office.Interop.Excel.Range cell1 = worksheet.Cells[1, lastNonEmptyCol + 1];
            // The range must end at the last empty column, not at the last non-empty one,
            // otherwise the last used column would be deleted as well.
            Microsoft.Office.Interop.Excel.Range cell2 = worksheet.Cells[1, lastEmptyCol];

            // Delete all empty columns after the last used column in a single call.
            worksheet.Range[cell1, cell2].EntireColumn.Delete(Microsoft.Office.Interop.Excel.XlDeleteShiftDirection.xlShiftToLeft);
        }    // else the last non-empty column is worksheet.Columns.Count

        // Delete the remaining empty columns that sit between used columns.
        foreach (int colIndex in colsToDelete.Where(x => x < lastNonEmptyCol))
        {
            worksheet.Columns[colIndex].Delete();
        }
    }