# Word Up

Notes on the challenges I faced recreating existing Word reports in Rmarkdown with Officer, Officedown and flextable.

Depending on your age, the song title ‘Word Up’ will either remind you of a bloke in a red codpiece, a Scottish rock band with a left field cover version, or possibly, Little Mix, (according to Google).

However, I urge you to put down your codpieces (yes, even you at the back) and instead gather round for tales of woe working with Microsoft Word.

Here’s the background: I’m currently working on a project where I am using parameterised rmarkdown for the first time - hooray! This will end up as 13 separate documents of about 30 pages, with lots of tables and charts. This is the first phase of many similar documents.

But, I have to render my final output to Word, which, I’m not going to lie, was a bit of a worry.
That’s a lot of Word, and a lot of scope for things to go wrong.

While I’m not adverse to Excel, I am far from a fan of Word.
Much of this is because of its infuriating layout quirks when it comes to margins,figures and tables.

The {officer}, {officedown} and {flextable} packages make working with Word easier.
You can use {officer} to create Word and Powerpoint documents when working in R, {officedown} to do the same in R Markdown, and {flextable} for, well, tables.

The advantage of {officedown} is it allows dynamic figure and table captions, cross referencing, and whole host of other features ( like multi column layouts, changing page orientation, and different formats for different sections) {officer} is a lower level package, where you work with adding paragraphs, and other blocks. One of my favourite features is block_pour_docx which allows you to insert an entire page, or even an entire document, into your rmarkdown file, as a one-liner.

{flextable} is not just for Word, and ranks alongside {gt} as a general purpose solution for all your table needs. If you haven’t considered it before now, it is definitely worth spending some time on.

There’s not a lot of documentation / help/ blog posts out there for working with these packages, especially for Word.
Or, at least, not in comparison to working with Rmarkdown generally.

Alison Hill wrote a post detailing her friction log when working with Powerpoint, so consider this my friction log of working with Word.

(These are in no particular order)

## Applying your departmental / organisational styles

You will see advice that says you need to knit a regular Word document ( from the New File menu item in RStudio), then open it (it doesn’t need any content) and then amend the styles in this document, before passing it to officer as a reference document.

My colleagues already had a .dot template with our departmental styles applied. I found that I could use that (after I’d saved a copy as a .docx file) and the styles were picked up by {officedown}. No need to generate a blank file and amend everything from scratch (there are a lot of style elements and it would have been really tedious). If you already have a document with styles in place, you should be good to go. Here’s how my Rmd file looked. You specify rdocx_document as the output format, and I saved my template file in the root directory so it was easy to reference :

output:
officedown::rdocx_document:
reference_docx: "my_report_template.docx"



## Adding a row to a table to create space / improve presentation

Throughout this project I’m recreating an existing suite of tables and plots that have, until now, been produced in Excel & Word. The first table had totals at Scotland, Council and Health Board level, then a space, then a range of lower level data.

Once I’d got the data into shape, I first added the space using dplyr::add_row().

But, over time, this felt wrong - I didn’t like the idea of adding an unnecessary empty row to a dataframe, especially if I might need it for further analysis later on.

The solution, instead, was to add padding to the relevant row(s)

flextable::flextable(df) %>%



Here I’m adding padding to rows 1 and 9 of my table ( flextable uses the i and j conventions for applying functions and modifications to rows and columns), and ensuring the padding goes at the bottom by setting padding.top to FALSE.

## Setting row heights

Another way to improve presentation in some of my tables was to reduce row heights for some rows, and increase it for others.

out_table  %>%
flextable::height(., i = c(4), height = .015, part = "body") %>%
flextable::height(., i = c(1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 13), height = .25,
part = "body")



Row 4 gets less space, while the other row heights are increased. N.B - I thought I could do something like

flextable::height(., i = c(1:3, 5:10, 12:13), height = .25,
part = "body")


but it didn’t seem to like it, so each row got entered individually instead.

## Amending number formats for specific columns

Here I amend the 3rd, 4th and 5th columns - because the 3rd column name is dynamic, I didn’t want to refer to it by name

flextable::flextable(table_data, cwidth = 1.05) %>%
flextable::colformat_double(.,j = c(3, 4, 5), digits = 1) %>%



For other tables, where I know the column names are static, I can enter the names directly:

flextable::flextable(table_data, cwidth = 1.05) %>%
flextable::colformat_double(.,j = c("Rate","Lower bound","Upper bound"), digits = 1)



I needed to add header row above the final two columns in my seven column table, denoting whether there was a statistically significant difference between them and a reference value.

flextable::add_header_row(.,values = c("","Significance"), colwidths = c(5,2))


The first 5 columns get passed nothing, while “Significance” spans the final two columns.

## Highlighting a row

Using bg, we can add a custom background colour to highlight a specific row.

I found that when creating the table in a function, I had to to use the which syntax below:


flextable::bg(out_table,
i = which(out_table$body$dataset$areaname == mylocation), bg = "wheat", part = "body")  Prior to this, I was using the syntax below, but it would not work inside a function.  i = ~ areaname == eval(mylocation),  ## Remove / blank out a column heading Easy enough, give the column name you want to amend and pass it "" out_table %>% flextable::set_header_labels(., areaname = "")  ## superscript in the chunks Ok, technically not an {officer} / {officedown} thing, but here it is anyway. You’ll see I’m specifying the figure captions here, also  {r fig6, fig.height = 4, fig.width = 6.5, fig.cap= "Annual birth rate per 1,000 population^1^, 2002 - 2020"} {r fig8, fig.height = 3.5, fig.width = 6.5, fig.cap= "Age-sex standardised all age death rate per 100,000 population^1^, 2002 - 2020^2^,^3^", fig.cap.style = "Image Caption"}  ## ensuring a table returned even if no data exists If this sounds crazy / stupid, then here’s the scenario. This particular table looks at deprivation. Not all areas have datazones within a certain deprivation decile, hence, there will not always be data to show. However, I want the table numbers to be consistent across all the documents - table 10 in doc1 should address the same topic as table 10 in doc 2 (instead of showing table 11 data under a table10 heading). I went belt and braces here, using tibble::tribble() and tibble::add_row()  if (my_nrows < 1) { out_data <- tibble::tribble(~Data Zone, ~Name, ~Rank in Scotland (1 = most deprived area 6976 = least deprived, ~ National decile of deprivation, NA_character_,NA_character_,NA_character_,NA_character_) outdata <- out_data %>% tibble::add_row(., tibble::tibble_row()) }  This returns an empty table for the edge case where currently no zones are in the specified decile. Here’s a strange one. When knitting docs individually, the resulting Word file opened and the table of contents was displayed. But, when rendering all of the docs in a loop using purrr, there was no TOC, and instead there was the following warning: I was asked if I’d like to update external links within the document. If I said ‘no’ , then the TOC did not update. If I said ‘yes’, then it would update, and appear as intended. I’m not sure why I only saw this when rendering with rmarkdown::render(), but there doesn’t appear to be a way round it. This means, when the final documents are distributed, we will have to advise readers to accept any prompts to update external links, which always feels a bit uncomfortable. Or, we have to ensure that we ourselves open and refresh all the documents before onward distribution. More on this here ## The great & / and debacle TLDR - lots of my variables had and within their names. In some datasets, this was changed to the ampersand. For example Skye, Lochalsh and Wester Ross became Skye, Lochalsh & Wester Ross I need to unify these, and let’s face it, typing “&” is easier, and takes up less space. Big mistake. All seemed to be going well, until I tried generating a report with Skye, Lochalsh & Wester Ross as the parameter. I’m sure someone is laughing at this, thinking, “well of course you can’t have “&”, that’s a special character in html / xml”, but I didn’t know that, being as I’m not an actual super nerd. (Well, OK, I am, but even I have my limits). Debugging this took me a long time. My first thought was I’d messed up my functions, and then I got some great advice on Twitter, and realised that the markdown file was rendering 100% completely and the failure point was between the .md file and the final Word output. Reverting all ampersands back to and ( which mean re-running the entire pipeline from scratch) allowed me to generate the elusive final document, and a lesson learned. I like ‘&’. I like ‘and’. But which is better? There’s only one way to find out. FIGHHTTTT! ## Best layout for table widths Depends a lot on your margins, but instead of relying on autofit(), this seems to work well :  flextable::width(., width = dim(.)$widths * 6.5 / (flextable::flextable_dim(.)\$widths))



## Inserting cover images

I’m going to leave that for the next post, which I hope will be published within the next few days..so please look out for the follow up post!