Python – Writing a pandas dataframe to a word document table via pywin32

ms-word, pandas, python, python-3.4, pywin32

I am currently working on a script that needs to write to a .docx file for presentation purposes. I use pandas to handle all the data calculations in the script. I would like to write a pandas dataframe into a table at a bookmark in a Word .docx file using pywin32. The dataframe consists of floats. The pseudocode is something like this.

frame = DataFrame(np.arange(28).reshape((4,7)), columns=['Text1',...'Text7'])

With pywin32 imported…

wordApp = win32.gencache.EnsureDispatch('Word.Application')
wordApp.Visible = False
doc = wordApp.Documents.Open(os.getcwd()+'\\template.docx')
rng = doc.Bookmarks("PUTTABLEHERE").Range
rng.InsertTable.here

Now I would like to create a table at this bookmark. The dimensions of the table should be dictated by the dataframe. I would also like the column titles to be the header row of the Word table.

Best Answer

Basically, all you need to do is create a table in Word and populate each cell with the corresponding value from the data frame:

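import os
import numpy as np
import win32com.client as win32
from pandas import DataFrame

# data frame (column names elided, as in the question)
df = DataFrame(np.arange(28).reshape((4,7)), columns=['Text1',...'Text7'])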

wordApp = win32.gencache.EnsureDispatch('Word.Application')
wordApp.Visible = False
doc = wordApp.Documents.Open(os.getcwd()+'\\template.docx')
rng = doc.Bookmarks("PUTTABLEHERE").Range

# Create the table at the bookmark.
# One extra row is added so the column names can go in a header row.
Table=rng.Tables.Add(rng,NumRows=df.shape[0]+1,NumColumns=df.shape[1])

for col in range(df.shape[1]):        
    # Writing column names 
    Table.Cell(1,col+1).Range.Text=str(df.columns[col]) 
    for row in range(df.shape[0]):
        # writing each value of data frame 
        Table.Cell(row+1+1,col+1).Range.Text=str(df.iloc[row,col])  

Notice that two 1s are added in Table.Cell(row+1+1,col+1). The first is needed because tables in Microsoft Word are indexed from 1, while data frame indexing in pandas starts from 0, so both row and col have to be shifted by 1.

The second 1 is added to row only, to leave the first table row free for the data frame's column names as the header. That should do it!
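
To make the index arithmetic easier to check without Word open, the mapping can be pulled out into a small helper. This is only a sketch: iter_table_cells and its fmt parameter are hypothetical names, not part of pandas or pywin32, and plain lists stand in for the data frame here.

```python
def iter_table_cells(columns, rows, fmt=str):
    """Yield (word_row, word_col, text) for a header row plus data rows.

    Word table indices are 1-based, so every 0-based position is shifted
    by 1. Data rows are shifted by one more to keep row 1 for the header.
    """
    # Header row: column names go into Word row 1
    for col, name in enumerate(columns):
        yield 1, col + 1, str(name)
    # Data rows: 0-based row index maps to Word row index + 2
    for row, values in enumerate(rows):
        for col, value in enumerate(values):
            yield row + 2, col + 1, fmt(value)


# Plain lists standing in for df.columns and the data frame's values
cells = list(iter_table_cells(['A', 'B'], [[0.5, 1.25], [2.0, 3.75]]))
```

With the objects from the answer, the loop would then presumably look like this: for r, c, text in iter_table_cells(df.columns, df.values.tolist()): Table.Cell(r, c).Range.Text = text. Since the data frame holds floats, passing something like fmt='{:.2f}'.format is one way to control how many decimals end up in the document.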