Google Sheets – File Import Dropping Leading Spaces

google sheetsgoogle-sheets-cell-formatgoogle-sheets-import-csv

I have a configuration .txt file that I'm trying to upload into a new tab in an existing Google spreadsheet. I choose File>Import and select Insert new sheet(s), Detect automatically (there is no delimiter and I want each line in the text file to be its own cell), and No for auto-conversion of text.

The file uploads for the most part as I want, and it assigns each line of the config file to its own cell in the A column (about 1300 rows). However, some lines in the text file are indented with spaces to make it easier to read and categorize items. When I open the file in notepad I see these leading spaces on some lines, but when the file is uploaded to Google sheets the spaces are lost.

The original file looks something like this:

...
# comment about this next part
header:
 var1: 2
 subHeader:
   var3: 'a'
   var4: 'b'
 var5: 'false'
 # another comment
var6: 'lorem'
...

But when I import it into sheets I get this:
(imagine each line is a separate cell in the A column)

...
# comment about this next part
header:
var1: 2
subHeader:
var3: 'a'
var4: 'b'
var5: 'false'
# another comment
var6: 'lorem'
...

Since there's no leading spaces, I can't tell which variables go under which headers/subheaders.

I have tried adjusting the delimiter in the import settings to each of the available options (auto, comma, tab, and custom), along with specifying a null custom value and a copy/pasted linebreak (not sure if that worked). All of these produced the same result, except for comma, which gave worse results that broke apart some plain text in comments.

I also tried copy and pasting straight from notepad and got the same result with leading spaces removed.

If I try manually typing data from the .txt file into Google Sheets without copy/paste, the leading spaces are preserved. It also works to copy one line at a time into a new cell, but this would be excruciatingly slow.

The only viable solution I can think of is to write a separate program that parses each line of the text file and inserts a leading period before each line, and then use spreadsheet formulas to drop the period. But this is not particularly elegant and I would prefer to do it all within Sheets (and in such a manner that I can reupload the file occasionally with updates).

Best Answer

After waiting a while and coming back to the problem, I stumbled upon a solution that's better for someone like me who knows nothing about apps script (though I'll grant that Rubén's solution did provide some useful links and probably would have solved my problem if it didn't require me learning apps script from scratch).

Basically, I imported the text fie with these settings: enter image description here

The key here is that I used a separator type of a single space. This means that any leading spaces got treated as a signal to move to the next cell in a row. After importing, I ended up with a 61 column and 1300 row sheet (luckily each line didn't have too many spaces). I inserted a new column on the far left, and entered the formula

=join(" ",B1:1)

in the cell A1. This just "un-separates" all the data that was split apart by using a space as a separator, and it includes those leading spaces that were previously lost. I then copied this formula down the whole column to combine all the rows back into single cells and selected column A and pressed Ctrl + C and Ctrl + Shift + V to remove the formulas and save all the text. After that, the original imported data was done and I deleted all the columns except A.

That's about it. At this point the data is mostly functional and it has all the leading spaces. The only problem you could run into is that now there are trailing spaces at the end of each line that was shorter than the longest one. And worse, if you had trailing spaces in your original text file there is no way to preserve them. In my case, though, where there were only leading spaces, I was able to use the formula

=arrayformula(iferror(regexextract(A:A,"^.*[^ ]")))

in cell B1. This removed all the trailing spaces, and returned null rows to being actually null. I saved it with the same trick of selecting column B and pressing Ctrl + C and Ctrl + Shift + V. At this point I deleted column A (which had the trailing spaces) and was done.

Ultimately, I think for larger projects or scenarios where many text files are being imported semi-regularly, Rubén's solution of a Google Apps script is probably better. However, this ended up not being terribly labor intensive, and more importantly it only required the used of Google Sheets formulas. Notably, my method would not work well for very large data sets, especially ones which have lots of spaces on a single line.