Some tips and tricks when writing academic papers

I have been teaching the course Research Methods, Tools and Issues in our MCT programme this semester. The last class was an “open clinic” in which I answered questions about academic writing. Here is a summary of some of the things I answered, which may hopefully also be useful for others.

Formatting

Your academic exam paper is not the place to experiment with fancy layout and formatting. Some basic tips:

  • Template: Choose a conservative template (but not too old-school). Check that it is in A4, as templates using the North American “letter” paper size look odd when printed on A4.
  • Font: A serif font typically looks more serious than a sans-serif, so “Times New Roman” or something similar is the safest choice.
  • Paragraphs: There are different ways of formatting paragraphs. The two most common ones are: (1) indented first lines, (2) space between paragraphs. The first type is the most common in professional typesetting and is what you see in books and academic journals. It is also the most space-efficient. Adding space between paragraphs is what most people do when they write on computers. Choose whichever type you want, but do not mix the two.

Writing

It is impossible to cover everything about academic writing here. But these are the things I usually comment on when I supervise:

  • Long sentences: It is better to write short and meaningful sentences than long and complex ones. Many students think that their text will look more academic if they make long sentences. The truth, however, is that it is much more pleasurable to read well-formed and meaningful sentences. The general rule of thumb is to include only one point in a sentence. If in doubt, start a new sentence. It also helps to read your text out loud. Whenever you feel that
  • Short sentences: Some students only write very short sentences. This is not good either, so find a balance.

Spelling and grammar

Whatever you do, ensure that your texts are not full of spelling mistakes. You should also reduce the number of grammatical errors. There are so many tools available to help you these days, so there is no excuse for not using them before you submit your final text.

Figures

Figures are nice, please include them! But when you do, always think about this:

  • Label: Figures should always have a figure name (“Figure 1”) followed by an explanation of what the figure is about (“This figure shows…”). Ideally, the figure text alone should be enough to understand what the figure is meant to convey. Many people (like myself) like to browse through papers and books quickly, and use the figures as a way to navigate the content. Then it helps if the figure texts are self-explanatory.
  • Text size: Figures are often made in different software than where you write. This means that you typically do not have full control of their size in the final layout, hence the text inside the figure may end up too large or too small. As part of the final layout, you should try to make the text size similar to the text size of the main document within which the figure is placed.
  • Units, labels and legends: If you include graphs or other types of representations of numbers, it is critical to include information about what the axes mean and the units that you have used (“Time (s)”, “Number of people”, “Vertical position (mm)”). You should also have clearly marked legends (if relevant) to explain what the different lines in your figure are.
  • Simplify: You should always aim to remove unnecessary elements from figures so that the most important things are what people see. This follows Tufte’s ideal of aspiring to a high “data-ink ratio”.

References

Adding references to your text, and including a bibliography at the end of your document, is the clearest sign that you are writing an academic paper.

  • Consistency: Ensure that all citations have an entry in the bibliography. Similarly, all entries in the bibliography should be referenced in the text.
  • Reference manager: Use a reference manager to keep track of everything. While it is not perfect, I generally recommend Zotero. It works on all platforms, has an online front-end, and integrates with many writing platforms.
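For those who write in LaTeX with BibTeX, the consistency check in the first point above can be partly automated. The following is only a minimal sketch, not a complete parser. The function names (cited_keys, bib_keys, check_refs) are made up for illustration, and the sketch assumes \cite-style commands (\cite, \citep, \citet, etc.) written on a single line, and BibTeX entries that start a line as "@type{key,".

```shell
# List the citation keys used in a .tex file, one per line.
# Assumes \cite-like commands on a single line, e.g. \citep[p.~3]{a, b}.
cited_keys() {
    grep -o '\\cite[a-zA-Z]*\(\[[^]]*]\)*{[^}]*}' "$1" |
        sed 's/.*{//; s/}$//' |    # keep only the text between the braces
        tr ',' '\n' |              # one key per line
        sed 's/^ *//; s/ *$//' |   # trim surrounding spaces
        sort -u
}

# List the entry keys defined in a .bib file, one per line.
# Assumes each entry starts a line as "@type{key,".
bib_keys() {
    grep -o '^@[A-Za-z]*{[^,]*' "$1" | sed 's/.*{//' | sort -u
}

# Report keys that are cited but not defined, and defined but never cited.
# Usage: check_refs TEX_FILE BIB_FILE
check_refs() {
    cited=$(mktemp)
    defined=$(mktemp)
    cited_keys "$1" > "$cited"
    bib_keys "$2" > "$defined"
    comm -23 "$cited" "$defined" | sed 's/^/missing from bibliography: /'
    comm -13 "$cited" "$defined" | sed 's/^/never cited: /'
    rm -f "$cited" "$defined"
}
```

A call such as check_refs paper.tex refs.bib (both file names are placeholders) prints one line per inconsistency and nothing if the two lists match. Special BibTeX entries such as @string or @comment would be reported as uncited, so the output should be read as a list of things to check rather than a verdict.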

Submission

  • PDF: If you are not asked to do otherwise, always submit a PDF file. This will ensure that both content and layout are preserved for the final reader. Submitting your “raw files” (.docx, .pages, .odt, etc.) is problematic for a number of reasons. First, they may not be readable by people on different platforms (.pages files only work on OSX, for example). Second, often such raw files contain the history of the file, which you may not want the end reader to see. This may be particularly important if you have been using track changes.
  • Good naming: Always give your file a useful name. If your exam is anonymous, include your candidate number in the file name. If not anonymous, include your last name. Your examiner will probably download a zip-folder with all submissions. Having a bunch of files with names such as “exam.pdf”, “submission.pdf”, etc., is annoying.
  • Supplementary files: It is often fine/useful/required to submit supplementary material. Then it is usually good to have a list at the end of your main document describing what you have chosen to include (for example, a list describing audio files). If you have many supplementary files, you should zip them and give the archive a useful name. Again: remember that the reader will download your submission together with a bunch of other things. It is your job to make the reading as pleasurable as possible.

General form

The standard “IMRAD” form of a paper looks like this:

  • Abstract
  • Introduction
    • Motivation (could include a rhetorical question/something catchy)
    • Research question(s)
    • (Hypotheses)
    • Definitions
    • Limitations / scope
    • Overview of the paper
  • Background (either chronological or topical)
  • Methods (be precise – explain what you did, how, etc)
  • Results
  • Discussion
  • Conclusion

Many interaction papers, and also “NIME-like” papers, have a form something like:

  • Abstract
  • Introduction
  • Background
  • Method
  • Design
  • Implementation
  • Evaluation / Discussion
  • Conclusion

Batch convert RTF files to TXT

Last year I decided to use plain text files (TXT) as the main file type for all my computer text input. There are several reasons for this, but perhaps the most important one was all the problems I experienced when trying to open other types of text-based files (RTF, DOC, etc.) on the various iOS and Android devices that I use daily. Another reason is to become independent of specific software solutions, rather than being forced to use a particular program for something as basic as writing text on your computer or device. Along the way I decided to shift my note-taking from MacJournal to nvALT. The best thing about nvALT is that it can unobtrusively monitor a folder of text files, and it allows for quickly searching old files and writing new ones. Since all the files are just plain text files stored in a regular folder (and sync’ed to the cloud), I can of course also use any text editor to view and write the files.

The problem was how to get all my previous notes into my new “system”. I have used a number of different note-taking applications over the years (e.g. Journler, DevonThink and Evernote). Fortunately, I have been quite careful about exporting all the notes regularly, mainly as RTF files. Having a few thousand such files (and some others), I looked for a way to quickly convert them to plain text files. There are more complex solutions for converting text files to various formats (e.g. Pandoc), but I found the easiest solution was to use the OSX command line utility textutil. This little line will convert all RTF files in a folder, including its subfolders, to TXT files:

find . -name \*.rtf -print0 | xargs -0 textutil -convert txt

It will (of course) remove any formatting, but it will preserve all the (text) content nicely.
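
When re-running the conversion on a folder that has already been partly converted, a slightly more cautious variant may be useful. The following is a minimal sketch under stated assumptions, not a replacement for the one-liner above: the function name convert_rtf_tree and the DRY_RUN variable are made up for illustration, file names are assumed not to contain newlines, and textutil itself is only available on OSX.

```shell
# Convert every .rtf file below a directory to .txt, one file at a time,
# skipping files that already have a .txt file next to them.
# Set DRY_RUN=1 to only print what would be done.
# Usage: convert_rtf_tree [DIRECTORY]   (defaults to the current directory)
convert_rtf_tree() {
    root="${1:-.}"
    find "$root" -type f -name '*.rtf' | while IFS= read -r rtf; do
        txt="${rtf%.rtf}.txt"          # same path, with .txt as extension
        if [ -e "$txt" ]; then
            echo "skip: $rtf"
        else
            echo "convert: $rtf"
            if [ -z "${DRY_RUN:-}" ]; then
                textutil -convert txt "$rtf"
            fi
        fi
    done
}
```

Running it first with DRY_RUN=1 set shows which files would be converted and which would be skipped, without touching anything.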

To footnote or not

By coincidence I have had several discussions about footnotes, endnotes and different types of citation styles recently. Such discussions often end up in “religious” wars, in which researchers from different disciplines argue why “their” system is the best. I often find myself agreeing with no one or everyone in such discussions, since I am working in and between several different disciplines (the arts, humanities, technology, psychology, medicine), and publish my own work in journals that use different ways of handling citations and notes.

What to cite or note?

Before discussing the different systems in more detail, it is worth remembering that there are usually two types of information that an author would like to include in the text:

  1. references to books, papers, etc. that you mention in the text.
  2. extra information that you do not feel is necessary to keep in the main body of the text.

I will try to make a clear separation of these two cases in the following discussion.

Different systems

The Chicago Manual of Style suggests that there are two basic documentation systems: (1) notes and bibliography and (2) author-date. In my experience there is also a third main type, which I could call numbered citations. Each has a different use:

  1. Author-year: the author’s name and the year of publication are placed within parentheses or brackets in the text, and at the end of the text is a reference list, usually ordered alphabetically. This system is only meant for citations, and can easily be combined with (foot/end)notes to add extra information. The “author-year” style is widespread in a number of disciplines, and is also widely used in many musicological disciplines except music history (I am here thinking of musicology in the European tradition, i.e., a heterogeneous group of disciplines focusing on the study of music).
  2. Notes and bibliography: in this system both citations and extra information are put into either footnotes or endnotes. The system is used differently, depending on the journal or publisher. Sometimes an author-year type of citation is put in the note and a full reference list is included at the end of the text. Other times the full reference is included in the note, without the need for a reference list. I have come across a number of different ways of implementing these two approaches, and combinations of them, in the “notes and bibliography” system. The “notes and bibliography” style is widespread in parts of the humanities, particularly in historical disciplines (including music history). The main difference between the “notes and bibliography” system and the two others (“author-year” and “numbered citations”) is that it allows for mixing citations and other types of information in the notes.
  3. Numbered citations: This may at first glance seem quite similar to using endnotes with “notes and bibliography”, but it is in fact quite different. The numbered citation system does not allow for mixing in other information. It is a purely citation-based system in which the numbers used in the citations in the text refer to numbers in the reference list, ordered either by appearance in the text or alphabetically. I often encounter this style in more technology-oriented publications, as well as in some medical and psychological journals. Sometimes you even find a combination of the “author-year” and the “numbered citations” systems, with abbreviated citation keys, e.g. (Jen07) instead of (Jensenius, 2007).

I guess there might be some researchers who only work with one of these systems throughout their entire career, but I usually have to adapt to any of these systems depending on where I want to publish. Come to think of it: I have just proofread the camera-ready versions of three journal papers that will be published in the coming months (more on that later), each of which uses one of the three systems mentioned above.

Since I use all three systems regularly, I have worked out writing and formatting techniques that work well with all of them (thanks to LaTeX and BibTeX), and have no problems adapting to whatever the publisher wants. That said, I have over the years formed a clear opinion of what I prefer myself: the “author-year” method. This opinion is solely based on what I think is the most efficient method for reading and writing texts. In the following I will try to explain the rationale behind this preference.

Why I like author-year for citations

My main argument for using the author-year style is based on efficiency of writing and reading. More precisely I will argue that the author-year system is:

  • Compact: The author-year system makes it possible to create compact texts, since the citations only take up a small space on a line (at least if the names are not too long). As such, it is more compact than putting citations (or even full references) on separate lines in footnotes or endnotes. The “author-year” system is less compact than the “numbered citations” system, in which the citation is only a number, but this is also what makes the “author-year” system more readable.
  • Readable: The author-year system makes it possible to read the text continuously, since the citations are placed inline with the text. I do agree that it may be more of a distraction to look up citations in a reference list at the back of the paper than to look in a footnote. However, if you know the field fairly well, or read through the reference list before reading the paper, it is possible to understand who is being referenced by only reading the main body of the text. The disadvantage of having to look up references in footnotes is that it distracts from the reading: you have to constantly shift focus up and down the page to find the note and then find your way back to where you left off. I have not found any tests on the speed of reading with different systems, but my own feeling is that my reading speed drops dramatically when I have to constantly move up and down the page between the content and footnotes.
  • Easier: I do a fair bit of manuscript reviewing, and lots of supervision of student papers and theses, and am convinced that the “author-year” system is easier to handle for most writers. From my own and many of my students’ experience, working with (foot/end)notes is a pain in most WYSIWYG programs (e.g. MS Word). I have seen countless examples of how all the footnotes in long master’s theses have been scrambled, renumbered, reformatted, etc. I have had no such technical challenges in LaTeX, but it is still much easier, for both writing and layout, to just include everything in the main body of the text.

Why I try to avoid (foot/end)notes

Some arguments for footnotes are that they allow for:

  • Quick access: the information is there at the bottom of the page.
  • Elaboration: allows you to further develop the arguments without distracting the main narrative.

I have for a long time been very fascinated with the concept of hypertext, and the possibilities that non-linear writing opens up. This in itself should be a good argument for me liking footnotes. The problem, however, is that footnotes, at least in the traditional sense, are very far from the ideas of hypertext. First, footnotes often seem to be used to dump content that the author did not feel was necessary/relevant/interesting enough to include in the main text. Second, the footnote is a dead-end, from which the only way out is to go back to where you came from. As such, footnotes do not support the concept of hypertext as an interwoven web of texts (yes, it sounds a bit silly to write this in 2012, but despite the progress of the www, hypertext as a concept and method is still just in its infancy).

There are certainly cases when you are uncertain as to whether a part of your text should be included or not, particularly when beginning to write a manuscript. That is also one of the reasons why I often use footnotes myself as a writing method, moving content back and forth between the main text and the footnotes. Writing is always based on decision-making: what should I include and what should I leave out? The problem with footnotes is that they can be used as an excuse for not getting rid of content that is not really necessary, as this quote summarises well:

But think whether such information needs to be present at all. If the term being footnoted in the first of these examples is so obscure, why not merely explain it? […] You should make every effort to make your work a pleasure to read. Reading it should not be an epic struggle on the part of your hapless reader.

That is the reason why I usually end up either throwing away the footnotes, or including them in the text as I finalise my manuscripts.

(Foot/end)notes in electronic documents

The last reason I am sceptical about footnotes is the move towards electronic documents. While footnotes may make sense in a printed document, they usually end up as endnotes in electronic documents (where there is no static concept of “pages”). For that reason, it may be easier to just work with endnotes in the first place, since the document can then be used more easily in both printed and electronic formats.

Working with static, dead-end endnotes in electronic documents is not very forward-looking, though. I would rather hope that we could work towards proper hypertexts, in which multiple layers/levels of text could be intertwined. Until that is possible, and accepted in scientific writing, I prefer to write and read linear texts without (foot/end)notes. I believe that is easier both for the author and for the reader.

Application writing as example of stretchtext

I have been working on an ERC Starting Grant application over the last months. Besides the usual conceptual/practical challenges of writing funding applications, this particular application also posed the challenge of writing not only one proposal document, but two: one long (15 pages) and one short (5 pages). I am used to writing research papers and applications where you are dealing with three levels:

  • title
  • abstract
  • content

But for the ERC application I had to handle four levels:

  • Title
  • Summary (2000 characters)
  • Synopsis (5 pages)
  • Proposal (15 pages)

While working on the application, I started thinking about my old fascination with hypertext theory. One concept I found (and still find) interesting here is Ted Nelson’s idea of stretchtext. Stretchtext can be seen as text that can literally be “stretched” to any desired length (see, for example, this example). Conceptually this makes sense. After all, we as humans are able to do such stretching fairly easily, always trying to fit our content to the limits we are given. For example, I have no problems talking about my current research project for 1 minute, 5 minutes, 20 minutes or 45 minutes; it is just about “interpolating” the content over the required timeframe. The challenge, of course, is to balance the content in such a way that it makes sense for different durations or numbers of pages.

But how do you go about writing 5 and 15 pages about the same thing? Should you write 15 pages first, and then cut it down to 5? Or is it better to start with 5 pages and then “interpolate” it to 15? My approach this time was not particularly structured, and I constantly found myself moving back and forth between the two documents. This was perhaps not the ideal solution, since I often found myself making the same changes twice.

The strategy I ended up with, and that I would probably start out with if I were to do such a thing again, was to use the commenting function of LaTeX more actively. In regular word processing software (MS Word, OpenOffice, etc.) there is no easy way to temporarily include or remove content from the document. The text in the document is there, and if you remove it, it is gone. In LaTeX it is possible to comment out blocks of text by just typing the % sign in front of each line. This makes it easy to “turn off” whole blocks of text. As such, my final 5 page synopsis document contained more or less the same stuff as the full 15 page document, but with large parts of the text commented out.

It would have been nice if LaTeX had offered a way to define levels of text. Then I could have chosen to write only one document, and defined which parts should be at level 1, 2, 3, etc. This could then have been used to output the different levels more or less automatically. Such an approach could perhaps be done with a text outliner (e.g., OmniOutliner), and I am curious to test this out at some point.

However, the biggest challenge of writing a stretchtext is probably not the software being used. It is rather to figure out what content to include, and make it work linguistically at the different levels. In the end, you might end up with writing two separate documents after all…