Brief screencast on f-score in blogs

Just because the subject came up in several contexts recently, I decided to make a screencast of me explaining the concept of f-score and applying it to some data from my corpus of company blogs. I tried to embed it in a blog post, but that caused several problems because the clip would neither fit nor scale for some reason.

Click here to view the screencast in a separate window. You can also download (right-click, save) and watch it in your favorite video player, which gives the additional luxury of being able to pause.

The three blogs I look at in the clip are Marriott on the Move, JNJ BTW and Delta Air Lines. Here’s the link to the cited article and to the presentation with the example.

And apologies for my lapse of memory towards the end (which blogs am I comparing again?), but it was a long day and organizing a conference occupies a lot of brain cells. I hope it’s still informative.

Like he said: the audience is everybody

From a recent post on JNJ BTW:

When I started JNJBTW, I thought my audience would be pretty much those who write about the business of healthcare — reporters, editors, healthcare bloggers — those folks. What I’ve found, after doing this for a year, is that the people reading this are, well, er, people. Doctors, nurses, consumers — employees and retirees — people who hate the company and people who support what we do — friends, neighbors, my father-in-law… well, you get the idea.

Now those who have been blogging for a while may think, “well, duh!?” but for me it was an important point — particularly since I’m often asked “who is your audience?” My answer, which many people scoff at, is that it is everybody — that I don’t define my audience, but that the audience defines itself.

From my recent post about style and audience design in corporate blogs:

Blogs are a part of the Internet and the Internet provides virtually anyone with near-universal access to information. This may seem like a truism, but it has significant implications. Whereas before groups of stakeholders would be targeted individually and the flow of information was highly controlled, this is no longer the case in a networked world. A careful examination of the Google-Sicko story reveals a case of audience underfitting, i.e. a company employee addressing a specific audience but effectively reaching a much broader readership (and, in this case, not with a positive result).

The problem encountered is the extreme reach and transparency of online publishing. Because we are used to addressing either individuals or select communities of people, suddenly reaching a diffuse, invisible and potentially vast audience is not always easy to handle. This is especially problematic when you talk about people who are also your readers (see the Google example).

As the author of a corporate blog, one thing to never forget is that your audience defines itself (well said, Marc!) and that you need to write accordingly. Forget all the cozy rhetoric about blogs being “personal” and “open” and so forth for a moment. The key thing to keep in mind is that the word you identifies the person(s) whom you are addressing and that words like they, users, consumers, the public etc denote those people whom you are not addressing. You are talking to the first group and about the second group. The unique aspect of blogs is that all those people that you conceptualize as being in the second group are also in the first, since anyone can potentially be a reader of your blog. The Google-Sicko example illustrates what happens in such a case: talking about someone who is part of the discourse is generally regarded as highly antisocial. In terms of language, we split the world into three parties: ourselves and those “with us” (I/we), our discourse partners (you) and everyone else (he/she/<name>). Making your reader feel treated as a third party is a mistake you don’t want to make.

The language of business, the language of blogs

I’ve just skimmed over this interesting post by Ron Ploof about the challenges of corporate blogging.

Here’s one point in the piece that caught my attention in particular:

3. Being conversational is unnatural:

Being conversational is unnatural in business communications because we’ve been taught NOT to do it. Communication specialists are used to writing “Press Releases” and marketing web pages. The good news is that outside of work, employees are very good conversationalists, so they already know how to do it, they just need to break some of their Old Media habits. Training works very well in this area. Lastly, companies cannot forget the most important ingredient of a corporate blog — transparency. Corporate blogs are conversational and transparent, and therefore should NEVER be used to spew traditional marcom drivel.

I have been thinking about the style of blogs and corporate blogs in particular for almost two years now. The persistent chant ‘blogs are conversations’ and ‘conversational good, business-speak bad’ has a tendency to drive the professional linguist in me nuts, not because I don’t agree with these popular ideas, but because I keep wondering what exactly conversational means and why it is unequivocally regarded as ‘better’.

Now, as I am gradually approaching the completion of my thesis, I think I can give a carefully weighed answer to that question.

Blogs are conversations? Partly yes, partly no

Firstly, when bloggers talk about ‘conversational’ what exactly do they mean?

Real-life conversations between human beings use many expressions that depend on the situational context to be understood. Things like that guy standing right there (so-called deictic expressions), false starts (And I was…. we didn’t go… No, Sue and I didn’t go to the meeting) and fillers (We need to… umm… discuss this in more detail) abound in face-to-face talk. Conversations also typically contain a lot of signals that serve purely to confirm and validate what your communicative partner is saying (things like yeah, okay, gotcha, right, uh-huh, nodding etc) and indicate your stance and social relationship. While conversations in TV shows, plays, novels and so forth are fast, witty and fluent, real conversations are often anything but - it’s just that we’re very good at ignoring all the noise they contain. We subconsciously filter out most of the static.

Blogs are obviously different in that blog entries are planned and not spontaneous (forget all the cutesy rhetoric associated with the word spontaneous for a moment - I use it to simply mean ‘instantly expressed’). Many bloggers, and most certainly the majority of corporate bloggers, will read a post they have written thoroughly before publishing it. In the case of marketing and PR-oriented blogs and with executive blogs such as that of Jonathan Schwartz it is safe to assume that an entire team of communications professionals reads, discusses and edits posts collaboratively before they are published. There is planning and polishing involved, none of which is possible in real-time conversation.

So it’s not that aspect of blogs that makes us think of face-to-face conversations. What we associate with interpersonal communication is the interactive nature of blogs - in other words, that they enable a dialog between blogger and reader. Our reasoning goes: ‘I can respond to what someone writes in their blog, so it is basically like a conversation’. The other aspect is language; the content and style of writing that is associated with blogs. Note that point - blogs are written, not spoken language, which means that none of the ‘noise’ described above occurs in them. Many things characteristic of spoken language never occur in blogs, especially not corporate ones.

Subjective as conversational

So apart from interactivity, what else is conversation-like about (corporate) blogs?

Have a look at this excerpt from One Louder, the blog of Microsoft staffing manager Heather Hamilton:

I’m not sure what has gotten into me other than the fact that I am happier than I have been for a VERY long time. It’s funny how sometimes things can just fall into place. The changes that I wanted to have happen at work happened without me doing much about it (other than saying “this is what I want”). I have finally started to spend some weekend time relaxing (and hanging with friends). And I am starting to believe what Eckhart Tolle says about coincidences not happening; it’s all for a reason (and with most of my life, I get the reasons for even some of the unpleasant things happening). Example: last week my manager and I were talking about me needing to travel to one of our dev centers. She recommended Ireland (oh yeah, I am totally doing that!) and I said “why don’t we have a dev center in Amsterdam? I really want to go there.” Then this week, I got an e-mail inviting me to speak at a conference in Amsterdam. How ’bout that? I’ve decided not to question what forces (if any) could be invovled with things like that happening. I’m just going to enjoy it.

In addition to business-related topics, Heather frequently writes about her personal feelings, thoughts and experiences in her blog, something that I’ve found to be typical of what I call ‘personal company blogs’. Such blogs are written by just one person, have a clearly visible reference to the blogger on the front page (name, photo) and are often part of a larger company blog hub (MSDN, in this case). In contrast to personal company blogs, team company blogs are usually about a specific product, issue or segment of the company and have several authors. I’ve found that writing about personal thoughts and feelings is less common in team blogs, largely because the topical focus of the blog tends to override personal concerns. By contrast, personal company blogs tend to be understood by their owners as diaries or journals where work-related subjects are integrated with personal thoughts.

Now, keep in mind how Heather writes and then have a look at this very interesting research on business English, conducted by Mike Nelson, an applied linguist at the University of Turku. Read Mike’s short article in the Guardian for a summary of his findings.

The kind of language used in corporate contexts (pre-blogging) is fairly strictly focused on a fixed set of topics. To quote Mike:

The world of business found in real life language is a limited one made up of business people, companies, institutions, money, business events, places of business, time, modes of communication and vocabulary concerned with technology. The language found was surprisingly positive, with very few negative words featuring at all. It was also found to be dynamic and action-orientated and non-emotive.

What Mike found via his large database of language samples from real-life business settings was that corporate language largely centers on things associated with business, namely business people, companies, institutions, money, business events, places of business, time et cetera and that these things are generally presented positively (business is about getting things done, not about being self-reflexive or critical). Finally, the subjective emotions of stakeholders aren’t really very important - private matters don’t figure into corporate discourse in any significant way.

Now compare that to how Heather writes. It’s a world of difference.

In posts marked with the ‘personal blogging’ tag, Heather writes about aspects of everyday life that we are all familiar with: buying furniture and cleaning out the garage, cheering for a sports team and experiencing a blackout. Not everything is always positive - there are ups and downs. Heather’s language can certainly be described as ‘emotive’ or ‘involved’, not because it is necessarily always highly emotional, but because it is concerned with inner processes more than with actions. All of this is obviously in stark contrast to what language in most other corporate contexts looks like.

There are a number of reasons why a ‘conversational’ style in that sense of the word is typical for both non-corporate and personal company blogs and why I expect it to have an influence on how institutions communicate, present themselves and are perceived in the future. I’ll focus on three basic pillars: audience, content and style.

Who you talk to

Blogs are a part of the Internet and the Internet provides virtually anyone with near-universal access to information. This may seem like a truism, but it has significant implications. Whereas before groups of stakeholders would be targeted individually and the flow of information was highly controlled, this is no longer the case in a networked world. A careful examination of the Google-Sicko story reveals a case of audience underfitting, i.e. a company employee addressing a specific audience but effectively reaching a much broader readership (and, in this case, not with a positive result).

The problem encountered is the extreme reach and transparency of online publishing. Because we are used to addressing either individuals or select communities of people, suddenly reaching a diffuse, invisible and potentially vast audience is not always easy to handle. This is especially problematic when you talk about people who are also your readers (see the Google example).

What you talk about

One notable aspect of Heather’s blog (and many others like it) is how openly it presents personal thoughts, experiences and feelings to readers. This is not necessarily done just for the audience. It seems that many personal company bloggers, though quite aware that their blogs are public, write partly to record their thoughts for themselves much in the same way that diarists do. The blog is a chronicle of what the blogger has thought, felt and done over time, both personally and professionally. Not every personal detail imaginable is presented, but there is no strict (and artificial) separation of personal and professional topics. Independently of how bloggers conceptualize audience, the effect of sharing personal information is that it lays the foundation for relationship-building.

Being told the subjective impressions, thoughts and emotions of another human being is almost inevitably relevant to us because we value such social information very highly. Knowing personal aspects of someone’s life brings us closer to them and establishes ties which are the foundation of any interpersonal relationship. This is especially pivotal on the Internet where all voices are detached from the individuals who use them. Social information enables us to establish a relationship with someone whom we have never met, because what we know about someone allows us to draw an increasingly complete picture of what kind of person they are.

Social information as a universal currency is especially valuable in a globalized and networked world, because exchanging it builds trust and without trust the foundation for other interactions is lacking.

How you say it

There is a persistent belief that jargon, technical language and other forms of special purpose lingo exist purely to irritate those of us who don’t understand them. That’s not quite true though - medical language or legalese may have that effect on people who aren’t doctors or lawyers, but among those who speak them these varieties are readily understood and used for plausible reasons. Jargon allows us to

  • delineate membership in an expert community (techies, lawyers, bloggers…)
  • describe aspects of our work/community/culture/shared experience with more perceived precision than ‘standard’ language allows

In other words, we often feel that what we want to say is said more effectively when we use a specialized vocabulary developed to express it. While this is unproblematic as long as we are talking to others who share our knowledge, this instantly turns into an issue when we address a broader audience - which is inevitably the case with a blog. All of a sudden, use of a specialized terminology makes us aloof, arrogant and out of touch. Audience underfitting once again leads to problems, this time in stylistic terms.

Finally, ‘conversational’ in stylistic terms also implies the use of colloquialisms, figures of speech and other expressive elements which are typically found in spoken conversation. The effect of such devices is again that they allow blogger and audience to conceptualize the blog as a speech situation, amplifying feelings of solidarity and familiarity.

What ‘conversational’ can mean

To summarize, ‘conversational’ can mean a range of things when applied to blogs. Among them are:

  • interactivity - it can describe the dialogic structure of blogs and the possibility to respond to contributions
  • speaker and audience - it can describe the discourse situation that the blog creates on a technical level and the resulting possibility for the blogger to refer to himself/herself (“I”) and address his/her readers (“you”)
  • content - it can describe a focus on personal and everyday topics which are familiar to a broad audience and create a feeling of solidarity and familiarity with the blogger
  • style - it can describe the avoidance of jargon and technical language (due to its audience-restrictiveness) in favor of expressions that evoke spoken language and real-life conversation

As always, feedback is appreciated.

They are what they write

That’s an extremely poignant quote about blogs and bloggers from NYRB’s Sarah Boxer. Read her very insightful piece here (via LanguageLog).

A simple method for recognizing corporate flogs

I’ve been working on a research paper dealing with the language and style of corporate blogs, specifically Life at Wal-Mart, for a few weeks now. My suspicion - that I think I can express with a good deal of certainty now - is that Life at Wal-Mart is a fake blog (or flog) in the sense that it would not be described as genuine by most people who know what a blog is.

How do I know?

Well, there are several indicators. In the 52 entries that I have analyzed there is not a single hyperlink, nor is there any instance where an external source (another blog, news website etc) is quoted.

Not once.

Of course it isn’t impossible for a blog to not link or quote, but it is a severe deviation from the norm. Another thing that struck me (and there are many more indicators that are listed in detail in the paper) is something mind-numbingly simple but quite salient:

Blogs contain the word blog. Life at Wal-Mart doesn’t.

Certainly newspaper editorials and scientific papers are bound to have a higher frequency of the terms editorial and paper than other types of texts, but the near-total absence of blogs without the word blog in them is far more striking than that. It seems all but impossible to blog without talking about it, and when people talk about it they have no choice but to use that term, because it’s the only one we currently have. The New York Times is unlikely to be full of references to the Washington Post for obvious reasons, but in blogs mentioning other sources, quoting them and linking to them is standard practice. And even if you don’t do it, any sort of reflection on what you’re doing will practically force you to use the word. Blogs that neither A) link to or quote other blogs nor B) contain some kind of meta-language are virtually nonexistent, and my corpus data reflects that.

So, what does that mean? Well, obviously a blog can be fake and still use the term blog in every single post. But if ad copy, testimonials or other textual building blocks from commercial genres are simply stuffed into WordPress and the result is called a blog, this method should pick it up.
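For the curious, here is a rough sketch of how such a check could be automated. It is not the procedure used in the paper: the function names, the regular expressions and the rule of flagging a collection only when every indicator is absent are assumptions made for illustration, and the sample posts are invented.

```python
import re

# Illustrative patterns (assumptions): what counts as a link, a quotation
# or blog meta-language would need tuning for a real corpus.
LINK_PATTERN = re.compile(r"https?://|<a\s[^>]*href=", re.IGNORECASE)
QUOTE_PATTERN = re.compile(r"<blockquote\b", re.IGNORECASE)
BLOG_WORD_PATTERN = re.compile(
    r"\bblog(?:s|ging|ger|gers|ged|osphere)?\b", re.IGNORECASE
)


def flog_indicators(posts):
    """Count how many post bodies contain a link, a quoted block or the word 'blog'."""
    return {
        "posts": len(posts),
        "with_link": sum(1 for p in posts if LINK_PATTERN.search(p)),
        "with_quote": sum(1 for p in posts if QUOTE_PATTERN.search(p)),
        "with_blog_word": sum(1 for p in posts if BLOG_WORD_PATTERN.search(p)),
    }


def looks_like_flog(posts):
    """Flag a collection only if no post links, quotes or mentions blogging."""
    counts = flog_indicators(posts)
    return counts["posts"] > 0 and not (
        counts["with_link"] or counts["with_quote"] or counts["with_blog_word"]
    )


# Invented sample data, not taken from any real blog.
sample_ad_copy = ["Our associates love working here.", "Great prices every day!"]
sample_blog = [
    'Saw this on <a href="http://example.com">a friend\'s site</a>.',
    "Why I blog.",
]
print(looks_like_flog(sample_ad_copy))  # True
print(looks_like_flog(sample_blog))     # False
```

A positive result only says that the usual markers are missing. As noted above, a fake blog that uses the word blog in every post would pass this check without any trouble.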

The elephant in the room: observations on the Google vs. Sicko incident

(Edit: this post by Teresa Valdez Klein on the subject is also interesting.)

The tumult over the whole affair has been impossible to miss. A little over a week ago, Lauren Turner, a health care marketer at Google, wrote a blog entry in which she criticized Michael Moore’s new movie Sicko for its allegedly unfair depiction of health care companies. The piece was posted in Google’s new Health Advertising Blog and led to an outcry. Many in the blogosphere saw Turner’s recommendation to insurance companies - buy ads from Google to fix your image problems - as a sleazy and manipulative form of marketing (samples: this post by ZDNet’s Dan Farber and this bit by Mike Abundo calling for Turner to be fired). The company reacted with two meta posts, one by Turner, explaining that the views expressed in her initial post were purely her own and a second one in Google’s main corporate blog that also sought to douse the flames. Since the incident made Slashdot it can be considered a fairly bad moment for Google’s PR.

Most comments that I’ve read deal with the question of accountability - whose opinion is expressed in an official blog and where do we draw the line between personal opinion and the company’s official stance?

While I also want to deal with that question, my impression is that Turner’s (and thus Google’s) mistake is not primarily the opinion expressed in the post - that Sicko is biased and treats health care companies unfairly - but failing to understand the communicative situation in which the exchange takes place. Turner manifests a fairly stunning lack of knowledge and sensitivity when it comes to blog sociology and that is why the piece caused such an uproar.

Let me elaborate, using several quotes from the post:

Lights, camera, action: the healthcare industry is back in the spotlight. (Not that it ever left the stage.) Next week, Michael Moore’s documentary film, Sicko, will start playing in movie theaters across America.

The New York Times calls Sicko a “cinematic indictment of the American health care system.” The film is generating significant buzz and is sure to spur a lively conversation about health coverage, care, and quality in America. While legislators, litigators, and patient groups are growing excited, others among us are growing anxious. And why wouldn’t they? Moore attacks health insurers, health providers, and pharmaceutical companies by connecting them to isolated and emotional stories of the system at its worst. Moore’s film portrays the industry as money and marketing driven, and fails to show healthcare’s interest in patient well-being and care.

These are the first two paragraphs of Turner’s piece and careful reading quickly reveals several interesting things. Firstly, the style is very journalesque. The lights, camera, action-enumeration in the first sentence could also be from a movie review or some other traditional journalistic text type (e.g. an editorial).

A slight shift occurs with the first instance of a personal pronoun (us). While the referent of the pronoun is at least somewhat ambiguous, it appears to be what could be called the ‘universal we’ that Turner uses - legislators, litigators, and patient groups are part of the American public, as are others among us. The referent of others is named a bit later: health insurers, health providers, and pharmaceutical companies are worried about the way the movie depicts them. The important detail here is that Turner does not place the two groups equally far away from herself. She could have simply written others are growing anxious or something similar, but by inserting among us she has placed herself (and arguably her employer) in direct proximity to her potential clients in the health care business. Of course that placement is quite deliberate - she wants to sell ads to these companies, after all - but it soon becomes clear why it is also highly problematic.

Sound familiar? Of course. The healthcare industry is no stranger to negative press. A drug may be a blockbuster one day and tolled as a public health concern the next. News reporters may focus on Pharma’s annual sales and its executives’ salaries while failing to share R&D costs. Or, as is often common, the media may use an isolated, heartbreaking, or sensationalist story to paint a picture of healthcare as a whole. With all the coverage, it’s a shame no one focuses on the industry’s numerous prescription programs, charity services, and philanthropy efforts.

I think you’ll agree that the entire paragraph is essentially a flowery declaration of love for the health care industry. Now, this isn’t surprising per se (again, this is a sales pitch), but the lack of balance is still noteworthy (the nasty press vs. the friendly insurance companies). But wait, there’s more.

Many of our clients face these issues; companies come to us hoping we can help them better manage their reputations through “Get the Facts” or issue management campaigns. Your brand or corporate site may already have these informational assets, but can users easily find them?

Note that here the pronominal references change. We becomes Google and the more distant our clients is replaced by you / your brand. Why is this significant?

Because the post starts out with no clear speaker and referent. There is no “I”, as in “I want to express my views on Sicko and the health care industry today” and no “you” as in “Dear John, how are you?”. The latter - that there is no clear referent - is perfectly normal for a blog, but the former is unusual. More importantly, these roles are only clearly assigned in the last two paragraphs.

We can place text ads, video ads, and rich media ads in paid search results or in relevant websites within our ever-expanding content network. Whatever the problem, Google can act as a platform for educating the public and promoting your message. We help you connect your company’s assets while helping users find the information they seek.

The pronominal reference at this point is clearly we = Google, you = health care companies. In other words, this is a message from Google to companies in that industry and while other people may also be reading it they are of no concern to the author. When a third party is introduced into the text (the public, later users), it is treated as though it were not a part of the exchange. Apart from pronominal use there are other signature characteristics of the text type that Turner had in mind when writing this: verbs such as act, educate, promote, connect and help are indicative, as is the need to tart up nouns adjectivally (relevant websites, ever-expanding content network etc).

If you’re interested in learning more about issue management campaigns or about how we can help your company better connect its assets online, email us. We’d love to hear from you! Setting up these campaigns is easy and we’re happy to share best practices.

This is the equivalent of telling Bob that you think Mary is fat… while she is standing next to you. The public that needs to be educated is the elephant in the room and it doesn’t like to be talked down to. Turner appears to be unaware of this, however. She seems to either assume that only potential clients will read the blog and that her pitch will work with them, or (even worse) that the gullible and asinine public will read it but not be offended.

The moral of the story is simple: you should anticipate that your blog is a public forum, no matter how specialized and in-group it may seem. Corporate bloggers should also forget most of what they know about the language of marketing. Certain linguistic tropes (like the aforementioned super-dupering of products via excessive use of adjectives) are recognized immediately and have a lot of potential for negative interpretation.

Delivering a sales pitch like this through a blog is bad enough; priding yourself on how effectively your employer can manipulate public opinion for the right price is… well, I believe in American English it is called effing stupid. The problem is further aggravated by the fact that Turner’s claim - this is my opinion, not Google’s - is extremely weak.

In all but the last sentence we is the personal pronoun of choice, and that we clearly refers to the company. Obviously, Google as a corporate entity cannot have an opinion, but what is posted in an official corporate blog will understandably be interpreted as noted and accepted by someone further up the ladder (and it seems unlikely that there was no monitoring in Turner’s case).

Not understanding blog stylistics is at least a part of Turner’s failure. She has applied a language common in one context to a completely different and inappropriate one and the result is a bit like someone telling a bad joke aloud at a funeral. Clarifying that your views are your own by using I instead of the collective company we is a decent start.
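A reading like this can be roughed out in code as well. The sketch below is only a toy: the word lists are illustrative assumptions rather than a validated coding scheme, it ignores context entirely (public as an adjective is counted just like the public), and the function name is made up. The sample sentence is the last one from Turner’s paragraph quoted above.

```python
import re

# Illustrative word lists (assumptions): which tokens point to the speaker,
# the addressee, or a third party that is talked about but not talked to.
PARTIES = {
    "speaker": {"i", "me", "my", "mine", "we", "us", "our", "ours"},
    "addressee": {"you", "your", "yours"},
    "third_party": {"he", "she", "they", "them", "their",
                    "users", "consumers", "public"},
}


def tally_parties(text):
    """Count tokens referring to the speaker, the addressee or a third party."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = {party: 0 for party in PARTIES}
    for token in tokens:
        for party, words in PARTIES.items():
            if token in words:
                counts[party] += 1
    return counts


sentence = ("We help you connect your company's assets while helping "
            "users find the information they seek.")
print(tally_parties(sentence))
# {'speaker': 1, 'addressee': 2, 'third_party': 2}
```

Counts like these cannot tell you who we or you actually refers to. That still takes a human reader, but they do make it easy to spot where in a text the roles start being assigned.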

Thoughts on knowledge blogs and an interview with Tess Ferrandez

And now, after an exciting trip into the world of science blogging, we return to our regularly scheduled program.

I’ve been meaning to write something on knowledge blogs (that I’ve previously referred to as expert or industry blogs) as one specific subgenre of corporate blogs for quite a while now. Several recent conversations on the subject have further increased my interest and yesterday I realized that I have been sitting on an exclusive interview with a knowledge blog expert for several months - something that I should absolutely share.

Knowledge blogs are written with the intention of providing insight and information into a topic a company blogger has substantial expertise in. They can be public-facing or have restricted access, but in both cases the target audience is usually a specialized one. A public-facing knowledge blog (or a limited-access blog that allows providing access to affiliates) can be written for customers who seek information and instruction, partners who collaborate in a project, experts at academic institutions, consultants etc. I imagine a typical intranet blog is likely to be more bidirectional than a public-facing one, meaning it is likely to be used for internal communication, partly replacing email, whereas a blog that is accessible to everyone (like the one I’ll present in a moment) is normally used for instruction, making the exchange between blogger and reader more unidirectional.

Software companies like Microsoft, IBM, Sun, SAP and Adobe use public-facing knowledge blogs on a large scale for the purposes mentioned above. The very technical nature of their products makes customer service a largely informational challenge and many of the customers are not end-users, but second-level developers who use specialized development tools to in turn create end-user products.

One extremely successful example of a knowledge blog from the IT sector (and obviously there are many) is If broken it is, fix it you should which is maintained by Tess Ferrandez. Tess is “an escalation engineer in PSS (product support services) at Microsoft, mostly dealing with ASP.NET but anything .NETish works” (from her about page). The application of terms such as “knowledge” and “expert” becomes natural when you take a look at what Tess writes about. To someone not educated in debugging ASP.NET applications virtually every sentence in the blog will be completely opaque, but to Tess’ sizable international audience her troubleshooting tips are invaluable.

Independently of whether or not you have a grasp of the subject matter, it becomes apparent quite quickly when reading If broken it is that Tess has a knack for explaining highly complex problems in an accessible way. Another aspect that intrigues me is that she often frames problems in a tone that resembles storytelling - there’s an arc of suspense, from the initial situation (something doesn’t work) to the discovery of the root of the problem and its resolution. Notably this kind of framing is the direct inversion of how issues are presented in a classical knowledge base. Contextual data (e.g. what the engineer thinks or experiences while he is working on the problem) is omitted. There is no sequence of events; instead facts are presented outside of time. For example, compare this entry from Tess’ blog with the knowledge base article it cites. The knowledge base article has no identifiable author (there is no “I”, like there is in the blog) and the sequence of topics does not map to a sequence of events. By contrast, Tess’ debugging examples are narratives; they don’t contain an objectively-detached analysis of a piece of software but the subjective-experiential description of how she approaches, assesses and fixes a problem. We learn by example.

There’s a lot I could write about why I think this is a very promising approach and what it has to do with how we process information, but I’ll save that for another post.

Here are Tess’ answers to 10 questions I asked her via email. I plan to conduct more of these interviews and use them for my thesis, to accurately describe the practitioner’s perspective on corporate blogging.

Once more, I would like to thank Tess for allowing me to interview her.

E-mail interview with Tess Ferrandez

Cornelius: What (if anything) do you enjoy most about blogging?

Tess: I enjoy the instant feedback from people reading the blog, and I enjoy teaching and debugging so blogging is the perfect venue for me to teach debugging and make sure that people don’t have to run into issues that they could easily avoid if they knew about them.

Cornelius: Did someone else encourage you to blog or did you start of your own accord?

Tess: I started on my own accord, we keep telling customers the same thing over and over in emails and I figured that a) I could avoid having to reinvent the wheel all the time b) other people that don’t call support could benefit from this knowledge and c) if it is documented somewhere people will trust it more since it is something that is already known and not something that was made up to fit the evidence from the dumps.

Cornelius: Do you publish in certain intervals or create a schedule for publication?

Tess: I don’t have a schedule, I blog when I have something that I think is interesting to write about and when I have time to blog. My blog posts are pretty sporadic, one blog post one month and 5 the next.

Cornelius: What prompts you to write a piece?

Tess: When I have had a case that was either extremely interesting or when I find that I see the same issue over and over.

Cornelius: How would you describe your goals when writing a piece?

Tess: My goals are that the posts should be interesting to as many people as possible, so I mostly blog about issues that will affect a lot of different developers. My goals are also that it should be easy to digest while at the same time contain enough detail to be useful, so I structure the content in a way that you can either read it all if you are interested in the details or just read the bottom line if you are just interested in the solution. The primary purpose of the posts is to show common issues and their solutions but also provide debugging tips so that people can resolve similar issues on their own.

Cornelius: Has your employer made any suggestions to you regarding topics that should be avoided (e.g. for legal reasons) or made any suggestions to you on what to blog about?

Tess: Not really, however I avoid four things:

1. Naming customers,

2. Naming 3rd party components

3. Providing information about items that are either confidential or that I know are prone to change to avoid confusion.

These are pretty much the same rules that apply to any communication we have with customers, they expect to be able to trust us so we should not leave out any information about them, and in terms of 3rd party products, if I haven’t tested them myself in a formal way I can’t really expect to be able to express a formal opinion about them.

Cornelius: What kind of reactions do you get from colleagues, clients etc. regarding your blog?

Tess: Only positive, a lot of my colleagues have started blogging after they saw my blog and how many readers I got, i.e. how many people benefit from it, and I have seen a trend of these blogs being very successful.

My blog gets about 100 000 web hits and 400 000 RSS hits a month, and if something I write even helps 1 % of those that would be a good return on investment.

I get emails on an almost weekly basis with positive comments from readers and customers which is extremely encouraging and prompts me to write even more.

Cornelius: Do you put a lot of care into formal aspects like spelling, grammar etc?

Tess: I try not to misspell too many words :) but I don’t fret about it too much, after all my blog is not about linguistics :)

Cornelius: Oh, linguists get these things wrong all the time, don’t worry ;-)

The reason I ask is mainly because some people (Robert Scoble, for example) say that to them blogs are conversations, so that in contrast to expository writing where you check, revise and edit a lot it’s mostly about speed and efficiency.

Your posts are very informational and complex and thus you probably spend more time planning and editing than someone like Scoble, who posts 4 or 5 very short pieces per day.

Cornelius: Has your approach to blogging changed over time?

Tess: Yes and no, after writing a lot of posts I can tell which posts are going to get a lot of hits and which ones aren’t, and also what people tend to search for when they get to my blog, so I try to keep titles etc. relevant so that more people can reach it and see immediately if it is relevant or not.

Cornelius: Do personal experiences play a role in your blogging?

Tess: I am not sure how to answer that. My blog is about personal experiences with issues that I have worked but I am not sure if that is what you are looking for.

Cornelius: My bad, the question wasn’t phrased very well. What I meant was: do you ever refer to things that aren’t strictly work-related, things that you would describe as personal? Obviously you don’t post pictures of your cat (though some tech people do) but do you ever use anecdotes or stories in your posts?

Tess: I would say no, I don’t post much about personal experiences, in fact I think the only personal post I have made so far was when I got blog tagged.

The main reason is because I don’t think that is what people reading my blog are interested in, but having said that I would use personal references if it adds to the story, i.e. if something in my personal life could act as an analogy to explain something complex.

I do add a lot of personal comments though to make the posts more readable because I don’t want them to be stale and dry, but on the other hand I would never tell stories about my family and friends in the blog because I want to keep it informational rather than “here is what i did today”.

Give me your blog and I’ll tell you who you are

Occasionally, like most nerds in academia, I wonder where exactly the usefulness of my research lies. What I do is fairly applied (in contrast to, say, theoretical syntax) but it still isn’t purely about solving real-life problems. Some people think that that’s a bad thing, while others completely oppose the view that science should generally be concerned with solving real-life problems.

Anyway, I recently came across a piece of applied research that’s both very interesting and mildly scary in its implications (although not very surprising if you’ve done research in the area in question).

The paper Effects of Age and Gender on Blogging by Schler et al is a study of roughly 140 million words of running text by close to 20,000 bloggers. The authors of the paper explain their questions and objective as follows:

How do content and writing style vary between male and female bloggers and among bloggers of different ages? How much information can we learn about somebody simply by reading a text that they have authored? These are very basic questions that are both of fundamental theoretical interest and of great practical consequence in forensic and commercial domains.

Note that the authors aren’t referring to human readers here, they are talking about natural language analysis using a computer. What they’ve done is to look at whether certain patterns in language use correspond with gender and age groups in a systematic way. In other words, is there a typical way of writing that distinguishes 14-year-old girls from 45-year-old men? If there is, it means you should be able to predict the age and gender of an author given enough textual material. Schler et al looked at several kinds of features for their analysis and found that they could predict the age and gender of bloggers with 70%-80% accuracy – or more, in some cases.

Here’s one of their observations on gender:

First, note that for each age bracket, female bloggers use more pronouns and assent/negation words while male bloggers use more articles and prepositions. Also, female bloggers use blog words far more [things like “lol” and “ur”] than do male bloggers, while male bloggers use more hyperlinks than do female bloggers. All of this confirms and extends findings reported earlier in [1,5,7] and lends support to the hypothesis that female writing tends to emphasize what Biber [3] calls “involvedness”, while male writing tends to emphasize “information”.

The results are equally stereotypical when it comes to characteristic content words for gender and age groups. Males talk about gaming, google, india and democracy, while shopping, cute, boyfriend and pink give away females. Teens are prone to use homework, boring, crappy and mum while twenty-somethings like bar, apartment, beer and dating. And once we’re in our 30s we’re suddenly more interested in marriage, tax, son and development. Note that these words are used more often by one gender/age group than by others, not that girls write only about shopping. This also explains the typicality of a word such as boyfriend – teenage girls are more likely to use this than other groups simply because they are more likely to have boyfriends than males.

In another table, Schler et al look at which words are most strongly gendered, i.e. most likely to be used significantly more often by members of one sex over the other. According to their results, men use the words money, job, sports and tv more often, while women use sleep, eating, sex, family, friends and words that express positive or negative emotions more frequently than the mean.
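To give a rough idea of what counting such features can look like in practice, here is a minimal sketch in Python. It is not the method used by Schler et al.: the word lists are tiny and hand-picked by me purely for illustration, and the function name `feature_rates` is hypothetical. It simply reports how often each feature class occurs per 1,000 tokens of a text.

```python
import re
from collections import Counter

# Tiny, hand-picked word lists for illustration only (an assumption;
# the feature sets used in the actual study are not reproduced here).
# Note that "to" is ambiguous (preposition vs. infinitive marker);
# a real analysis would need part-of-speech tagging.
FEATURES = {
    "pronouns": {"i", "you", "he", "she", "we", "they", "me", "my", "your"},
    "articles": {"a", "an", "the"},
    "prepositions": {"of", "in", "on", "at", "by", "with", "from", "to"},
    "blog_words": {"lol", "ur"},
}


def feature_rates(text, per=1000):
    """Return occurrences of each feature class per `per` tokens."""
    # Very crude tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {name: 0.0 for name in FEATURES}
    counts = Counter()
    for tok in tokens:
        for name, words in FEATURES.items():
            if tok in words:
                counts[name] += 1
    return {name: counts[name] * per / len(tokens) for name in FEATURES}


if __name__ == "__main__":
    sample = "I think the cat sat on the mat, lol"
    for name, rate in feature_rates(sample).items():
        print(f"{name}: {rate:.1f} per 1000 tokens")
```

Rates like these, computed per author, are the sort of input a classifier could then use to guess at age or gender. How well that works depends entirely on the feature sets and the amount of text available.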

So where does that leave us? Do we have to feel depressed over the fact that apparently our grammar gives away who we are? What about our individuality?

I don’t think it’s a big deal, for two reasons. Firstly, these are averages and their goal is to describe what is typical, not how you or I really write. Secondly, the goal here is to identify authors by age and gender in ambiguous cases (for example in forensic linguistics), not to make any blanket judgements about men and women or teenagers and grown-ups.

But of course it’s interesting to see that the way we express ourselves can be so very markedly gendered. Where does the male preference for articles come from? Why do females use more pronouns? And does our writing style really become “more male” as we age, as Schler et al suggest?

Another point to note is that prepositions and articles, which are used more frequently by male bloggers, are used with increasing frequency by all bloggers as they get older. Conversely, pronouns, assent/negation words and blog words, which are used more frequently by female bloggers, are used with decreasing frequency as bloggers get older. In short, the very same features that distinguish between male and female blogging style also distinguish between older and younger blogging style.

Or perhaps this just supports the idea that teenage girls have a unique subculture that sets them apart linguistically…

What McDonald’s is saying

Thought it was time to share a few blog-related statistics with you once again. I’ve looked at three different things in McDonald’s Open for Discussion blog using my corporate blogging corpus.

a) f-score (for details on what that is, read this post)

b) most frequent nouns

c) collocates of the noun PACKAGING

For visualizing f-scores and noun frequencies I’ve once again used IBM’s nifty Many Eyes tool. Have a look.

[Embedded Many Eyes visualizations: f-scores and most frequent nouns]

For the third step (collocates) I’ve used what’s called a concordancer in linguistics to look at the contexts where the noun PACKAGING is typically used.

Concordance for PACKAGING

1. Designing Packaging With the Environment in Mind in Open for Discussion

asked how we address sustainability issues in designing our packaging. I’m happy to jump on this question because we’ve b

2. Designing Packaging With the Environment in Mind in Open for Discussion

at McDonald’s as the manager for initiatives to reduce our packaging impacts. The short answer is that we study the pote

3. Designing Packaging With the Environment in Mind in Open for Discussion

hat we study the potential environmental impacts of any new packaging design. We work hard to ensure that our packaging w

4. Designing Packaging With the Environment in Mind in Open for Discussion

f any new packaging design. We work hard to ensure that our packaging will be environmentally responsible while also meet

5. Designing Packaging With the Environment in Mind in Open for Discussion

onable cost. Our main environmental priorities for consumer packaging include:- Minimizing use of materials. - Favoring m

6. Designing Packaging With the Environment in Mind in Open for Discussion

ls made from renewable resources, like wood fiber. - Having packaging that can be recycled or composted. - Incorporating

7. Designing Packaging With the Environment in Mind in Open for Discussion

ckage was featured awhile ago, as an example of sustainable packaging, in a trade magazine called packaging World. I wi

8. Designing Packaging With the Environment in Mind in Open for Discussion

mple of sustainable packaging, in a trade magazine called packaging World. I wish I could take you back to the late 198

9. Designing Packaging With the Environment in Mind in Open for Discussion

d take you back to the late 1980’s so you could compare our packaging then to what it is today. There are big differences

10. Designing Packaging With the Environment in Mind in Open for Discussion

ese solutions–and many others–by working with our primary packaging supplier and Environmental Defense. At the end of

11. Designing Packaging With the Environment in Mind in Open for Discussion

the 1990’s, we calculated the results of our collaborative packaging efforts. They showed that we’d eliminated 300 milli

12. Designing Packaging With the Environment in Mind in Open for Discussion

rts. They showed that we’d eliminated 300 million pounds of packaging during the decade. And that was in the U. S. alone.

13. Designing Packaging With the Environment in Mind in Open for Discussion

next time you eat at McDonald’s, take a closer look at our packaging. And let me know what further questions you have. -

14. Saving the Earth and Saving Money in Open for Discussion

he decade, we’d eliminated a total of 300 million pounds of packaging. And it didn’t cost us a penny. We also worked with

15. Greening Our Supply Chain in Open for Discussion

out our sustainable fisheries program and our work with our packaging supplier. Let me tell you about another initiative-

16. Engaging in the Global Obesity Dialogue in Open for Discussion

rant chain to begin providing nutrition information on food packaging. We are using a simple icon-based format because we

17. We Want You… To Critique Our Worldwide Corporate Responsibility Report in Open for Discussion

ompany?- Limited consumer interest?- Limited technologies - packaging…- Established food production systems - price set

18. We Want You… To Critique Our Worldwide Corporate Responsibility Report in Open for Discussion

know more about your plans for new nutrition information on packaging to be in 20, 000 retaurants worldwide by end of 200

19. Spinning Green in Open for Discussion

me period, McD USA spent more than $1/2 billion on recycled packaging With a leading environmental organization’s help, ou

= 19 matches in 6 blog posts for your query
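
For the curious, a listing like the one above can be produced with very little code. The following is a minimal keyword-in-context sketch in Python, not the concordancer I actually used. It assumes the posts are available as a plain dictionary mapping titles to text, and the names `concordance` and `print_concordance` as well as the `width` parameter are made up for this example.

```python
import re


def concordance(posts, keyword, width=60):
    """Return (title, left context, match, right context) for each hit.

    `posts` maps a post title to its text (a hypothetical structure).
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for title, text in posts.items():
        # Collapse whitespace so line breaks don't split the context window.
        flat = " ".join(text.split())
        for m in pattern.finditer(flat):
            left = flat[max(0, m.start() - width):m.start()]
            right = flat[m.end():m.end() + width]
            hits.append((title, left, m.group(0), right))
    return hits


def print_concordance(posts, keyword, width=60):
    """Print numbered hits followed by a summary line."""
    hits = concordance(posts, keyword, width)
    for i, (title, left, kw, right) in enumerate(hits, start=1):
        print(f"{i}. {title}")
        print(f"{left}{kw}{right}")
    n_posts = len({title for title, *_ in hits})
    print(f"= {len(hits)} matches in {n_posts} blog posts for your query")


if __name__ == "__main__":
    # Invented sample texts, not taken from the corpus.
    demo = {
        "Post A": "Our packaging is green. Packaging matters.",
        "Post B": "Nothing relevant here.",
    }
    print_concordance(demo, "packaging", width=20)
```

The context window is cut by character count rather than at word boundaries, which is why the lines in such a listing tend to start and end mid-word.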

I’ll omit a detailed commentary this time, just take this as a sort of text-statistical doodle. Oh and I picked OFD for no particular reason - the blog just happened to pop up in my records.

Who’s afraid of apostrophes?

Just picked up this little piece of spelling advice from Seth Godin. Quote Seth:

That’s the primary function of the apostrophe–to expose apostrophe ignorance.

You get no points for using one right, and lose big points when you market any idea while using them wrong. It doesn’t take long to check (especially in a headline or even worse, when designing a sign) and it’s worth it. The Marriott in Boston spent a fortune in interior decoration, and then decides to invent a whole new word.

My apologies, but I can’t resist playing Mr. Smartypants here - call it an occupational hazard.

The function of the s-genitive in English is to mark possession or relation, allowing me to refer to the vehicle owned by John as John’s car. The other construction that basically does the same thing is the of-genitive as in the mayor of New York. Note that New York’s mayor would be just as acceptable to most speakers, but the car of John seems at least a bit odd. If you are intrigued as to how this works, have a look at this seminal book on genitive variation by my colleague Anette Rosenbach.

But back to Seth’s restroom sign. The mistakes associated with the s-genitive come from its closeness in form to the most common plural marker in English, which is simply -s, with no apostrophe, as in one cat, two cats. This is annoying and confusing for a bunch of reasons.

1. In writing, the way the genitive is marked is as messy as can be. Usually we just stick ’s onto the end of a word, but when the word itself ends with s we only attach an apostrophe (e.g. Cornelius’ blog is boring to read). This is no clear-cut distinction though - some people would prefer Cornelius’s blog is boring to read. Beyond that there are cases where there appears to be a genitive, but the spelling doesn’t reflect it. Seth ironically closes his blog entry with the words

Sincerely your’s, Seth

which is of course incorrect - we write yours and its in English, though technically it would have to be your’s and it’s. The reason the latter versions aren’t used is that the apostrophe already plays another role in them. It’s is expanded to It is and your’s somehow suggests your is, which would be an irregular use of be. Thus yours, its, hers and his express possession without using an apostrophe in their spelling.

2. Next up, phonetically there is usually no distinction between s-genitive and plural, making anything marked in either way potentially ambiguous. Think about hearing the incomplete sentences “the cute cats…” versus “the cute cat’s…”. Without the ending of the sentence it would not be possible to tell whether this refers to several cats that are cute or something that a single cute cat has.

3. Furthermore, it is not phonetically discernible when something is genitive-marked and plural-marked at once. Take this utterance:

Jane owns two felines, Hairball and Garfield. The cute cats’ home is a typical mid-sized suburban apartment.

The cute cats’ home here means the home of Hairball and Garfield, something we can narrowly recognize just by looking at the spelling of cats’, even though that spelling is already very idiosyncratic. (Shouldn’t it be cats’s? As noted above, some people indeed think it should.) But in spoken language it is impossible to pick this information up. Without the first sentence we are not able to tell apart +GENITIVE +PLURAL and +GENITIVE -PLURAL (note that -GENITIVE +PLURAL is syntactically impossible in the example).

4. In the case of women’s (or womens, as the sign says), things are even slightly more complicated. The common way of marking the plural in English is by sticking -s onto the ending of a word, but there are less common alternatives, most of which are referred to as “unproductive” by linguists, because they are no longer used with newly coined words. Among these unusual endings are gems like -en (one ox, ten oxen) and a whole heap of endings for words with Latin and Greek origins such as criteria and alumni. Women has what is called an umlaut plural, in which a vowel sound is shifted, usually to a higher position, to indicate plurality.

5. Finally, what makes things just a tad quirkier is the question of whether or not we need a genitive at all here. In my own native language, German, the word for women’s restroom is Frauentoilette (the more polite Damentoilette, meaning ladies’ restroom, is more common, but the first version would certainly be understood). Frau means woman, Frauen means women and toilette means toilet. So the literal translation from German to English would be women restroom, where women modifies restroom to denote a specific kind of restroom. Since the women’s restroom isn’t exactly owned by women, one could argue that the genitive is obsolete, because we’re not indicating possession. It certainly isn’t essential to tell us what we want to know. It’s simply a convention of the English language.

So the restroom sign at the Boston Marriott has fallen victim to confusion on potentially several levels. Most likely the person who made the sign simply wrote it down as you would say it, not marking the distinction between plural and genitive correctly (note that this doesn’t mean the writer confused genitive and plural, only that they didn’t know how to distinguish the two in spelling). Since the plural is already unambiguously marked via the umlaut, the -s at the end basically has to mark possession, apostrophe or no apostrophe. And even without any indication of a genitive, people would still understand what the sign refers to. It’s a thin line: I can state that I am storing files on my computer’s hard drive or on my computer hard drive and the meaning wouldn’t change in a way that would really affect what I am saying.

Now, of course Seth’s point was that such a mistake is sloppy and makes the hotel look bad and that’s perfectly true.

Diverging from standard language is risky because all sorts of associations - sloppiness, lack of education or even intelligence - are easily evoked when one makes certain errors. But then there are also examples where an unconventional way of writing eventually becomes mainstream. I’m not sure if McDonald’s introduced the simplified spelling of thru (instead of through) in the word drive-thru, or if the term has other origins, but it can certainly be called a success. There are many good reasons for using a modernized spelling for words ending with -ough and who knows, perhaps that will eventually happen in mainstream usage. While the situation is different with the genitive, the fairly high frequency of errors made by native speakers is a perfect indicator for the lack of coherence and clarity when it comes to “the rules”.

McDonald's drive thru
