Today I need to get a list of worksheets from a very large spreadsheet (~100 worksheets, many thousands of cells on each) quickly from the command line. My ruby script tried to parse the entire workbook and exploded all my RAM. Then I discovered grep -0 which prints matching occurrences only (each on a separate line) which is very useful with regular expression patterns. The “sheet name” entries are in the “workbook.xml” zipped up in the Excel file.
$ unzip -c your-large-spreadsheet.xlsx xl/workbook.xml | grep -o sheet\ name=\"[^\"]*\" | cut -d = -f 2
Turns out that the Excel VBA function UsedRange doesn’t always give the expected results. It gives you the range of cells that have ever been used. If you’ve formatted them, or they used to have values in that you’ve since deleted, you still get those cells. Today I wanted to iterate all cells in a lot of sheets. Some of those sheets had lots of cells in their UsedRange. I hit upon a solution I’ve not seen anywhere else: take the union of constants and formulas in a sheet. This is quick, because you can use SpecialCells to do the heavy lifting. And you can refine the results to exclude errors using SpecialCells Value argument (the default 23 below selects everything). Here’s the code
Function myUsedRange(ws As Worksheet, Optional lookfor As Long = 23) As Range
Dim c As Range, f As Range
On Error Resume Next
Set c = .SpecialCells(xlCellTypeConstants, lookfor)
Set f = .SpecialCells(xlCellTypeFormulas, lookfor)
If c Is Nothing Then Set c = .Cells
If f Is Nothing Then Set f = .Cells
Set myUsedRange = Union(c, f)
This returns a non-contiguous range containing only the cells you want. This version is a bit lazy and includes [A1] if there are no constants or no formulas matching your search. I didn’t mind because I was using lookfor=2 to search for text which reduced the number of cells by thousands. Hundreds of thousands in some cases. A better version would take a default as a parameter and return it only if neither constants or formulas gave any results.