Jul 4th, 2007 | Linguistics, Other Stuff, Robert Scoble, Technology, Web 2.0 | 2 Comments
Robert Scoble likes Google better than Microsoft (but not much) - and I have proof for that. He also holds his wife Maryam dearer than his company PodTech, but sadly she is outranked by Twitter and Apple. Ah, cruel World 2.0 capitalism.
How do I know? Simple, I have a list of 1,587 posts with 273,994 running words of text that Mr. Scoble has produced between 2 Aug 2006 and 4 Jul 2007. That translates into 18,362 sentences. An average Scoble blog entry has a length of 172.6 words, with 14.9 words per sentence and an average word length of 3.8; all of which is fairly - deep breath - average for a blog.
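For readers who want to try this on their own corpus, here is a minimal sketch of how per-post averages like these could be computed. The tokenizer and sentence splitter are naive regular expressions chosen for illustration; they are assumptions, not the tools actually used for the figures above, and `corpus_stats` and `sample_posts` are hypothetical names.

```python
import re

def corpus_stats(posts):
    """Return average words per post, words per sentence, and word length."""
    words = []
    sentence_count = 0
    for post in posts:
        # Naive sentence split: ., ! or ? followed by whitespace or end of text.
        sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", post) if s.strip()]
        sentence_count += len(sentences)
        # Naive tokenization: runs of letters, digits and straight apostrophes.
        words.extend(re.findall(r"[A-Za-z0-9']+", post))
    return {
        "words_per_post": len(words) / len(posts),
        "words_per_sentence": len(words) / sentence_count,
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }

# Two made-up posts, only to show the call.
sample_posts = [
    "I like Google. I like video too!",
    "Microsoft has a blog.",
]
stats = corpus_stats(sample_posts)
print(stats)
```

The averages quoted above follow directly from the totals: 273,994 / 1,587 is roughly 172.6 words per post, and 273,994 / 18,362 is roughly 14.9 words per sentence.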
All, except for the word count. It’s pretty impressive, especially when you consider that he’s been at it for almost 6 years (I believe he started in October 2001 - correct me if I’m wrong). That’s 69 months of blogging, which translates into an estimated staggering 1.65 million words. That would make him twice as productive as William Shakespeare, who (only) managed 884,647 words in his entire lifetime, though in all fairness it has to be noted that Mr. Scoble didn’t have to write all that with a quill pen.
And here are his favorite nouns, by frequency (the number after the word indicates how often it occurs).
1 Google 1015
2 blog 779
3 Microsoft 776
4 people 688
5 video 503
6 stuff 393
7 things 365
8 something 357
9 way 354
10 Web 343
11 lot 322
12 today 320
13 time 301
14 thing 290
15 link 280
16 Apple 267
17 week 259
18 Search 258
19 world 256
20 post 245
21 videos 229
22 bloggers 220
23 interview 217
24 Twitter 215
25 blogs 213
26 company 206
27 one 199
28 Maryam 199
29 update 197
30 day 195
31 fun 193
32 someone 192
33 news 190
34 team 185
35 companies 178
36 lots 177
37 iPhone 175
38 service 172
39 Steve 171
40 show 171
41 site 170
42 TechMeme 169
43 business 165
44 phone 160
45 Windows 159
46 conference 158
47 year 158
48 PodTech 153
49 minutes 153
50 developers 151
Apr 10th, 2007 | Corporate Blogging, Debbie Weil, Fake Blogs, Gourmet Station, PR, ROI of Blogging, Robert Scoble | 1 Comment
That’s the title of a great 1997 album by Blonde Redhead and as it happens, it is also today’s topic - just in a way not related to alternative rock, but to (corporate) blogging.
Here’s the thing: it never ceases to intrigue me how often I come across blogging-related advice. There’s no shortage of suggestions, guidelines and even rules out there - rules that are often considered absolute and inviolable by those who postulate them. Often suggestions from perceived authorities such as Robert Scoble and Debbie Weil on how to blog are interpreted as dogma; for example, the maxims that blogs are personal, that you must be transparent and so forth have all become pervasive*. How often have you read that a blog is a conversation, or that misleading readers about the identity or motives of the blogger is immoral?
I don’t want to challenge any of these ideas, but I do want to make a distinction between the different shades of meaning of the words blog, blogging and blogger, because it is hard to talk about something when you lack a consistent definition. I also want to question the validity of the judgment that certain blogs are “fake”, or at least ask whether that’s really a bad thing.
Blogging is alternately understood as
a) the use of a publishing technology
b) the style in which blogs are often written
c) the type of social interaction between the blogger and his readers
and often - but not always - it is the combination of all three of these things. Note that they build upon each other: a bloggy style makes limited sense when you’re writing a letter (using another publishing technology), because even though the two types of text share several common traits they also differ significantly in other regards.
Say you’re a Java developer who likes to write about coding, snowboarding in the Rockies and Frank Miller comic books. You’ve set up an installation of Wordpress on your own webserver and publish your first entry. It could start like this:
Hey everyone! So, guess what, I’ve decided to start a blog too. I’ll post here from time to time to talk about whatever catches my interest […]
Even with just a handful of words, it can be clearly established that this kind of writing appeared in a blog and not, say, a newspaper, a personal diary, or a speech, even though it contains elements that are also common in these genres (of course it has the word “blog” in it, but even without that keyword I think a classification is possible). Now imagine that you’re a loyal reader of this blog and one day you find out that your snowboarding hacker friend is actually an invention - a fictional character developed by the department of systematic deception (DoSD) of a global PR firm (let’s call it Noble PR).
How would you react to this piece of information?
I think one gets a good idea of how people feel about these things when looking at blogs like this one and reactions such as these (read the first few comments). Blogs like Gourmet Station’s have been widely criticized for “violating the rules” and “being fake”. Where do these sentiments come from? They are the result of a holistic interpretation of blogs as a specific combination of a publishing technology, a style of writing and a kind of social interaction (a + b + c; see above). In other words: if you run blogging software, write from a first-person viewpoint and directly address your readers, it is assumed that you are a real person, because only real human beings can engage in such an interaction (meaning a + b implicates c).
There are good reasons why you might want to use a blog as a publishing tool without writing in a bloggy style or allowing comments from your readers. Tools such as Wordpress and Movable Type are used for everything from publishing poetry to managing entire websites and their versatility makes “non-traditional” usages plausible. But the Catch 22 appears to be style: if a writer makes frequent use of the first-person pronoun, vocatives, interjections and other stylistic elements that are traditionally frequent in spoken language in what looks like a blog in terms of presentation, it must be assumed that he is communicating with me, because that is how a typical blog works.
Social interactions of even the simplest type represent an investment for the participants. I react to you in a certain way because I have assumptions both about you and about your assumptions about me. If my assumptions turn out to be unfounded, the result is a loss of face. Nobody wants to deal with someone who isn’t honest about their identity.
The special thing about blogs is that the technological frame they live in makes it especially plausible to assume these things. Nobody finds the conversational style described above terribly confusing or irritating in a novel, despite the fact that we usually know the difference between the voice of the author and the voice of his fictional characters**. But the difference is that I can’t interact with the author when reading a novel and thus there is very little likelihood that I’ll mistake what is going on for a real instance of communication that somehow involves me.
So where does that leave us? And why is the title of this post “fake can be just as good”?
Despite the outrage two years ago, the fictional T. Alexander still blogs for Gourmet Station and the blog has a PageRank of 5 out of 10 (this site has a mere 3). It shows up in fourth place if you google for “gourmet blog” and, according to Technorati, almost 400 links point there. Finally, the Northeastern University/Backbone Media Study lists it as an example of successful corporate blogging.
Here’s a (rather long) excerpt that provides an excellent picture of Gourmet Station’s approach to the blog (taken from the study):
Donna described how everything on the blog has to be consistent with the brand. She moderates the comments and makes sure those comments are consistent with the brand. No profanity or unrelated comments are allowed on the blog. Donna explained that “everything has got to be very buttoned up, we have a very buttoned up brand, and we have a very upscale brand, very upscale, well educated customers. So anything that goes out there has to be consistent with that.” The blog also allows the company to discuss their content in a laid back tone. That content has produced higher rankings on search engines and helped to increase traffic to the blog by 10%.
Donna believes it to be important that the people who write on the blog are knowledgeable about food and wine. The blog’s readers are looking for ideas around food, drink, and entertainment.
The blog has helped Donna’s company add content to their website on the topics and products the company is focused on providing. Also, the blog has given Donna the ability to place content that they otherwise would not have been able to put on their website. Donna said it was important that a company covers all of the topics they wish to cover in their blog posts, and to categorize those topics by keyword.
The Gourmet Station blog has achieved a number two ranking on the keyword “gourmet dinners” in Yahoo! The blog has played a big part in helping the company to achieve that ranking. According to Donna, the blog has also helped establish the company’s brand and provide more sales conversions by making a “passionate connection” with readers.
The topic that generates the most conversation and interaction from readers on the blog is romance. Donna said that made sense, as the search volumes for romance and dinner have a great connection.
Donna selects the content of the posts by season. Donna said the blog has 14 categories, and the company always has a recent post in each of the categories.
Donna recommends a company have a strategy before starting to blogging. Her company has two strategies: to fill their categories with content and to increase they’re (sic) ranking on search engines.
The bottom line appears to be: Gourmet Station designed a blog to increase search engine visibility and to publish material that did not fit into the context of a traditional corporate site. Perhaps they felt that this material was too context-dependent (recipes for seasonal gourmet foods, etc), or that a less formal style of writing was needed, but only in a certain limited area and not for the entire site. Whatever their motivation - there is hardly a rational reason to argue against their success. Whether “fake” or “real” (note the quotes), it appears that different strategies can realize different goals for different people.
I’m pretty sure that examples such as the Gourmet Station blog will remain marginal, though. It’s not really because of the outrage “fake” company blogs generate (is there such a thing as bad PR?), but because it seems somewhat contrived and unnecessary to come up with a fictional character to write your blog when you might just as well have a real person do it. It’s not too hard to stick with The Message even when you’re blogging under your own name - numerous product blogs out there prove that. How you measure success is an entirely other question. In that context, note Gourmet Station’s specific goals of increasing visibility and publishing “unconventional” content.
So there it is. You can blog, or you can publish via a blog. Or you can do the latter and hope that people will believe it’s really the former. Not much shame in that, I think.
* The single most important document in this context is probably Scoble’s Corporate Weblog Manifesto, which seems to have influenced most subsequently formulated blogging guidelines.
** Of course this is systematically exploited in literature, for example in epistolary novels. Playing with the status of a piece of writing as ambiguously real or fictional was also a hallmark of Postmodernism.
(Edit) Here are a few more interesting links I initially forgot to include: one, two, three.
Feb 9th, 2007 | Corporate Blogging, IBM, Jonathan Schwartz, Microsoft, Robert Scoble, Style, Thesis, Visualization | 13 Comments
I’ve been playing around with this great little tool for several days now and thought I’d share some of the results with you.
But first, here’s a brief recap of what I’ve been doing before I start throwing statistics at you.
I am in the process of building a textual database (or corpus, as linguists call it) of corporate and enterprise web logs. The purpose of this corpus is to investigate corporate blogs as a text type. In the current phase of my research, I am especially interested in the following questions
- how do corporate blogs compare stylistically with non-corporate blogs, news texts and other types?
- is there a typical ‘corporate blogging style’ in terms of how people write?
- are there recognizable differences in style that correspond with differences in purpose or authorship (in other words, do CEOs, marketers, software developers, etc have distinct styles?)
- how much variation is there stylistically between different blogs, different bloggers in the same hub (e.g. MSDN) and between different posts by the same blogger?
- are there patterns of change in style over time?
You might wonder what such a description is good for (well, apart from furthering the pursuit of knowledge and all that). I think that, on the practical level, it will enable us to better understand what people are trying to achieve with blogs and how they do it. Ultimately blogging is about good writing. The trouble is, neither is ‘good’ easily defined, nor is it always the same to everyone on any occasion. Blogging styles are highly dynamic and situation-dependent and I think the most successful bloggers very consciously adapt different styles to address different people and issues.
Right, so what do I have so far?
One of the first measures I’ve implemented into my database is a relatively simple formula for calculating how formal/informational or (on the other end of the scale) involved/context-dependent a text is. This is done by adding the frequencies of certain types of words together and subtracting others, under the assumption that (for example) nouns are more numerous in texts which are primarily informational, while a high frequency of pronouns indicates involvement. The formula looks like this:
(see Heylighen and Dewaele 2002)
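As a rough illustration, the measure as published by Heylighen and Dewaele adds the percentage frequencies of nouns, adjectives, prepositions and articles, subtracts those of pronouns, verbs, adverbs and interjections, adds 100 and halves the result. The sketch below implements that published form. It assumes the words have already been tagged by word class, and the counts are made up. With percentage frequencies the published formula stays between 0 and 100, so the scores reported in this post (some well above 100) suggest that the version used here is scaled differently; treat this as a sketch of the idea, not of the actual implementation.

```python
FORMAL = ("noun", "adjective", "preposition", "article")
DEICTIC = ("pronoun", "verb", "adverb", "interjection")

def f_score(tag_counts, total_words):
    """F-score from raw counts per word class.

    Frequencies are taken as percentages of total_words, following the
    published formula: (formal - deictic + 100) / 2.
    """
    def pct(tag):
        return 100.0 * tag_counts.get(tag, 0) / total_words

    formal = sum(pct(tag) for tag in FORMAL)
    deictic = sum(pct(tag) for tag in DEICTIC)
    return (formal - deictic + 100.0) / 2.0

# Hypothetical counts for a 200-word post (made-up numbers, not corpus data).
counts = {
    "noun": 50, "adjective": 14, "preposition": 24, "article": 16,
    "pronoun": 20, "verb": 40, "adverb": 12, "interjection": 2,
}
score = f_score(counts, total_words=200)
print(score)
```

More nouns and articles push the score up; more pronouns and verbs pull it down, which is why list-like posts and conversational posts end up at opposite ends of the scale.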
As you can guess, the results are potentially ambiguous - in other words, texts can have a very high or low score for a variety of reasons - and should be used with care. That being said, the measure produces some pretty interesting results.
This is a chart of f-scores from Robert Scoble’s blog


Each data point in the graph is the f-score for a single post, or the average for several posts made on a single day. As the graph shows, Scoble’s posts are fairly consistently in the 50s in August and September. They surge to over 100 in mid-October and make overall gains in November and December, though these gains aren’t really as significant as they might look at first. The more notable change is the high degree of variation in these months compared to the time span before that.
You might wonder which posts exactly get a high or low f-score. Here are the entries with the highest score, by date.
Comparing new TailRank/DiggTech/TechMeme to Google Reader, 16 October 2006 (f-score 102)
Grapes on a Plane, 29 October 2006 (f-score 97)
The highs and lows of CES, 15 January 2007 (f-score 93)
Photo “training”, 21 January 2007 (f-score 106)
If you have a look at those posts, you’ll probably notice that they aren’t really in any way more formal than Scoble’s other writing. The difference is that they tend to be more informational, i.e. they pack in more, and more condensed, information than most entries. Lists and enumerations will immediately lead to a high score (because they usually translate into a high noun count) and for Scoble those entries which are written in a sort of telegraph style to convey information about a photowalk or CES thus have a high score. This doesn’t really discredit the f-score as a metric - it simply means that it’s context-sensitive. What’s important is that, with an overall mean score of 60, Scobleizer ranks on the extreme low end of the formal/informational vs involved/contextual scale. To Scoble, blogs really are conversations, not just metaphorically but in a quite literal stylistic way.
That’s the score for one source over time. Let’s compare a bunch of sources.


If you have trouble seeing anything on the chart, look for a little dropdown menu on the lower right hand side labeled dot size. Change it from ‘posts’ to ‘no selection’ and all the dots will be changed to have the same size, which should make the whole thing a lot easier to read.
The chart is a representation of scores for 137 different blogs, computed from data collected during the last five months. Each dot represents a single blog and its average f-score on the x axis. The position of a dot on the y axis indicates the standard deviation of values inside of that blog, i.e. the degree of internal variation.
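The two coordinates of each dot can be sketched in a few lines. This is a minimal example with made-up scores and hypothetical blog names; it assumes the sample standard deviation, since the post does not say which variant was used.

```python
from statistics import mean, stdev

def summarize(scores_by_blog):
    """Map each blog to (mean f-score, standard deviation of its post scores)."""
    summary = {}
    for blog, scores in scores_by_blog.items():
        # stdev needs at least two posts; fall back to 0.0 for a single post.
        spread = stdev(scores) if len(scores) > 1 else 0.0
        summary[blog] = (mean(scores), spread)
    return summary

# Made-up per-post scores for two hypothetical blogs.
scores_by_blog = {
    "blog_a": [50, 60, 70],
    "blog_b": [100, 200],
}
points = summarize(scores_by_blog)
for blog, (x, y) in points.items():
    print(blog, x, round(y, 1))
```

Note how the blog with only two posts gets a far larger spread than the one with three evenly spaced scores: with very few posts, a single unusual entry dominates the deviation.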
The vast majority of the sources I’ve used are corporate blogs - after all that’s what my research is about. But in addition I’ve also thrown in a few non-corporate sources, simply to be able to compare one type of blog with another one. Thus the list contains 17 personal blogs randomly found via blogger.com, 1 a-list professional blogger (Scoble), 1 political blog hub (huffingtonpost.com) and 3 non-blog sources, namely editorials from the New York Times, the Washington Post and the LA Times collected in the course of this week (see below for a full list of sources).
The first things likely to catch your eye are the outliers. On the far right hand side, there is one source simply tagged “Blog” (informative, I know) with a record f-score of 195 and a standard deviation of 92. That’s Ray Ozzie, Chief Software Architect of Microsoft. Now, if you have a look at his blog you might find that the best description for his writing is not so much formal, but rather “technical” or maybe “information-oriented”. The reasons for the high scores are the many compound nouns (things like development ecosystem, application components, clipboard data formats, etc) coupled with the overall significant length of entries. Like the other outlier, Irving Wladawsky-Berger of IBM, Ozzie also produces very long posts. Ozzie’s longest has 1,700 words, while Wladawsky-Berger is a close second with 1,500. Length tends to coincide with somewhat higher f-scores; however, there are counter-examples. Heather Hamilton has one post with a whopping word count of over 2,000 and an f-score of only 105. Generally brief posts tend to coincide with lower scores, but, as the example shows, there are exceptions.
Overall it is important to consider a few things, especially with regard to those sources with a high standard deviation and a high f-score:
- the deviation is often high simply because there aren’t many posts (for example, Ozzie only has 6 entries)
- several of the high-deviation blogs are hubs, i.e. they aggregate a number of individual blogs (e.g. MSDN and HuffPo)
But the cool part is that the remaining sources usually contain very conscious stylistic variation (Jonathan Schwartz is a prime example). In other words, they write differently to address different people and achieve different things and this is - at least to some extent - stylistically visible. Compare that with the scores for the three newspaper editorials grouped together in the lower right area of the plot. They are surprisingly consistent if you consider that we’re looking at texts published in three different papers, written by an even larger number of journalists. Which just shows that the editorial is a pretty solidified type of text in terms of style, while the (corporate) blog isn’t - at least not yet.
Anyway, I’ll wrap it up for now and save the more in-depth look for another post.
Sources
iUpload InSights
http://hopper.iupload.com/default.asp
Time Leadership
http://www.jimestill.com/
I Love Me, vol. I
http://www.michaelocc.com/
Simply Albert
http://simplyalbert.blogspot.com/
ChristianLindholm.com
http://www.christianlindholm.com/christianlindholm/
PR Thoughts
http://www.prthoughts.com/
Occam’s Razor
http://mgoldberg.typepad.com/occams_razor/
Loic Le Meur Blog
http://www.loiclemeur.com/
CTO Blog
http://www.capgemini.com/ctoblog/
Lakattack
http://spreadlog.net/
Marcel Reichart Blog
http://marcellomedia.blogs.com/mrb/
stefan
http://stefan.21publish.com/
Amazon Web Services Blog
http://aws.typepad.com/
Cisco High Tech Policy Blog
http://blogs.cisco.com/gov/
Digital Straight Talk
http://www.digitalstraighttalk.com/
Direct2Dell, Dell’s Weblog
http://www.direct2dell.com/default.aspx
eBay Developers Program
http://ebaydeveloper.typepad.com/
EDS’ Next Big Thing Blog
http://www.eds.com/sites/cs/blogs/eds_next_big_thing_blog/default.aspx
From Edison’s Desk - GE Global Research Blog
http://www.grcblog.com/
Real Baking with Rose Levy Beranbaum
http://www.realbakingwithrose.com/
GM Fastlane Blog
http://fastlane.gmblogs.com/
Google Blog
http://googleblog.blogspot.com/
Dan Socci’s Blog
http://h20325.www2.hp.com/blogs/socci
Kara R
http://www.honeywellblogs.com/kara_r/
ING Asia/Pacific’s Blog
http://mycupofcha.ingblogs.com/
TinyScreenfuls.com
http://www.tinyscreenfuls.com/
Open for Discussion
http://csr.blogs.mcdonalds.com/default.asp
One Louder
http://blogs.msdn.com/heatherleigh/
NIKEBASKETBALL
http://blog.nikebasketball.com/
OraBlogs
http://www.orablogs.com/orablogs/
Things That Make You Go Wireless
http://businessblog.sprint.com/1/1/
The Lobby from SPG
http://www.thelobby.com/
Jonathan Schwartz’s Weblog
http://blogs.sun.com/jonathan
Texas Instruments Video360 Blog
http://blogs.ti.com/
The Jason Calacanis Weblog
http://www.calacanis.com/
Boeing Blog: Randy’s Journal
http://www.boeing.com/randy/
Guided By History
http://blog.wellsfargo.com/guidedbyhistory/
PlayOn
http://blogs.parc.com/playon/
Yahoo! Search Blog
http://www.ysearchblog.com/
The CEO’s Blog - John Mackey
http://www.wholefoodsmarket.com/blogs/jm/
Blog
http://www.nixonmcinnes.co.uk/about-us/blog/
Kate’s Blog
http://katesblog.u3.com/
The Bocada Blog
http://bocada.typepad.com/bocadablog/
Michael M’s X10 Blog
http://www.x10community.com/michaelm/
Notes from MNR
http://blogs.adobe.com/notesfrommnr/
Entrepreneurial Marketing
http://blogs.accenture.nl/EntrepreneurialMarketing/
TiVo Blog
http://blog.tivo.com/tivo_blog/
Guinness Blog
http://www.guinnessblog.co.uk/blogs/home.aspx?App=guinnessblog&allowAccess=4r7a6h
Hu Yoshida’s Blog
http://blogs.hds.com/hu/
Forta Blog
http://www.forta.com/blog/
Novell Open PR
http://www.novell.com/prblogs/
Jeff Jaffe’s Blog
http://www.novell.com/ctoblog/
Blog
http://rayozzie.spaces.live.com/blog/
Mena’s Corner
http://www.sixapart.com/about/corner/
Alan Meckler
http://weblogs.jupitermedia.com/meckler/
Infrablog
http://blogs.verisign.com/infrablog/
Thomson Holidays Blog
http://thomsonholidays.blogs.com/my_weblog/
Baby Babble
http://stonyfield.typepad.com/babybabble/
The Bovine Bugle
http://stonyfield.typepad.com/bovine/
Stone Creek Coffee Blog
http://sccv3.stonecreekcoffee.com/blog.cfm
bugBlog
http://rescuebugblog.typepad.com/rescue_bugblog/
Speaking of Security
http://www.rsasecurity.com/blog/
Hybrid Talk
http://hybridtalk.nyse.com/
Jonathan Bruce’s WebLog
http://jonathanbruceconnects.com/jonathan_bruce/
The Tinbasher Sheet Metal Blog
http://www.butlersheetmetal.com/tinbasherblog/
The NCC Weblog
http://www.northfieldconstruction.net/
Signs Never Sleep
http://signsneversleep.typepad.com/
ACCAbuzz
http://www.accabuzz.com/
English Cut
http://www.englishcut.com/
Life at Wal-Mart
http://walmartfacts.com/lifeatwalmart/
Scobleizer
http://scobleizer.wordpress.com/
The DustBlog
http://thedustblog.blogspot.com/
The Baby Blawg
http://babyblawg.blogspot.com/
life’s short…make it sweet…
http://dunlin.blogspot.com/
xbsg
http://mi50.blogspot.com/
I am the evil master genius
http://arnique.blogspot.com/
i want you
http://nuratikahnabilah.blogspot.com/
44 Words for 365 People
http://44for365.blogspot.com/
neurotic kitten
http://nkitten.blogspot.com/index.html
Discover Norwegian Music
http://discovernorwegianmusic.blogspot.com/
my smiles arent a facade
http://badass-freak.blogspot.com/
[title unreadable]
http://chibinyu.blog.com/
Flying Tragic
http://tragicflyer.blog.com/
The Irony of Life
http://mujerlatina319.blog.com/
cudgeland
http://cudge.blogspot.com/
Over the Horizon
http://blogs.zdnet.com/OverTheHorizon/
DaveBlog
http://blogs.netapp.com/dave/
Earthling
http://blogs.earthlink.net/
developerWorks blogs
http://www-03.ibm.com/developerworks/blogs/
Irving Wladawsky-Berger
http://irvingwb.typepad.com/
Forum Nokia Blogs
http://blogs.forum.nokia.com/author_group.html?id=2
Nokia N90 Blog
http://n90.bloggercomm.com/
Sparkle Like The Stars
http://www.sparklelikethestars.com/
FYI Blog
http://fyi.gmblogs.com/
Southwest Airlines Blog
http://www.blogsouthwest.com/
Benra Blog: ZoomAlbum, Photos & Photo Sharing
http://benra.typepad.com/
WeatherBug Corporate Blog
http://blog.weatherbug.com/
CTO Blog - TalkBMC
http://talk.bmc.com/blogs/blog-bishop/cto/
Commentary from Cape Clear’s CEO […]
http://www.capeclear.com/annrai/
QuickBooks Online Edition The Team Blog
http://quickbooks_online_blog.typepad.com/
The QuickBooks Team Blog
http://www.quickbooks.blogs.com/
The Mindjet Blog
http://blog.mindjet.com/
Warehousing and Distribution
http://thirdpartylogistics.blogspot.com/
The Official Salesforce Blog
http://blogs.salesforce.com/
Park City Mountain Resort
http://parkcity.typepad.com/park_city_mountain_resort/
SunbeltBLOG
http://sunbeltblog.blogspot.com/
TaylorMade Blogs
http://www.taylormadeblogs.com/
Scenic Nursery Gardening Blog
http://www.scenicnursery.com/
Lightning Labels Blog
http://lightninglabels.typepad.com/blog/
Wiggly Wigglers
http://wigglywigglers.blogspot.com/
EIE FLUD
http://www.eieflud.co.uk/blog/
Eriska, Scottish Islan
http://www.isleoferiska.com/
Outdoor Landscape Lighting
http://www.residential-landscape-lighting-design.com/blogger.html
Thoughts of Beauty
http://www.overallbeauty.com/beauty-blog/
Stormhoek Winery
http://www.stormhoek.com/
Chevron Collectible Toy Cars
http://chevroncarsblog.com/
MSDN Blogs
http://blogs.msdn.com/
Ruby is Coming
http://rubyiscoming.blogspot.com/
am I lonely
http://rongsheng.blogspot.com/
Pineywoods Opinings
http://longleaf.blogspot.com/
Tangent, Oregon
http://tangentcity.blogspot.com/
Verizon - PoliBlog
http://poliblog.verizon.com/PoliBlog/Blogs/poliblog.aspx
Ted’s Take
http://ted.aol.com/
The Student LoanDown
http://blog.wellsfargo.com/StudentLoanDown/
Emerson Process Experts
http://www.emersonprocessxperts.com/
A Thousand Words
http://1000words.kodak.com/
Glenfiddich Blog
http://blog.glenfiddich.com/
IT@Intel Blog
http://blogs.intel.com/it/
All My Eye
http://allmyeye.blogspot.com/
HuffPo Full Blog Feed
http://www.huffingtonpost.com/theblog/
News@Cisco Notes
http://blogs.cisco.com/news/
Mobile Visions
http://blogs.cisco.com/wireless/
Open standards, open source, open minds, open opportunities
http://www-03.ibm.com/developerworks/blogs/page/BobSutor
Marriott on the Move
http://www.blogs.marriott.com/
NYT Editorials
http://topics.nytimes.com/top/opinion/editorialsandoped/editorials/
Washington Post Editorials
http://www.washingtonpost.com/wp-dyn/content/opinions/columnsandblogs/?nav%3Dleft&sub=new
LA Times Editorials
http://www.latimes.com/news/opinion/editorials/
Nov 23rd, 2006 | Corporate Blogging, Debbie Weil, Jonathan Schwartz, PR, Robert Scoble | No Comments
I just came across this short article in the Guardian, posted last week. It follows the usual modus operandi of mentioning Robert Scoble and Jonathan Schwartz (and Thomas Mahon of English Cut fame) and goes on to quote Debbie Weil numerous times (not that there’s anything wrong with that).
But the real gem is right at the beginning of the piece:
When The Carphone Warehouse boss Charles Dunstone started his corporate blog earlier this year, he was hailed as a cutting-edge chief executive; a man prepared to open up the inner workings of his company to the wider world and willing to communicate directly with his customers.
But that was April, when Britain’s biggest mobile phones retailer was riding high on a wave of favourable publicity about its “free” TalkTalk broadband offer.
Scroll forward a few months and the web is full of tales of “My TalkTalk Hell” as the group struggles to cope with the demand it so badly under-estimated, leaving thousands of customers angry and frustrated.
So what did Dunstone do at the height of the crisis? He simply stopped blogging. From September 1 until earlier this week - two and a half months - he failed to make a single entry. His post this Monday largely consists of an apology for his lengthy absence and a reassurance that the broadband supply problems are being worked out.
Ouch. If there’s one general, universal rule of business blogging, it’s that in the midst of a crisis, silence is not golden. Posting positive messages while the sailing is smooth is fine, but if there’s any time when a blog is almost indispensable, it’s when things go awry. Why? Because a blog is by far the best channel to make clear beyond doubt that
a) you recognize that there’s a problem
b) you’re sorry
If you aren’t convinced that those two aspects are extremely relevant, ask these guys about it. It’s a bit like Seth Godin once pointed out in a very interesting presentation at Google. Godin shocked his listeners by telling them something both harsh and true: nobody cares about your product. I believe he later qualified the statement - obviously a lot of people do care about Google’s products - but in assuming a complete lack of interest and “passion” on the side of customers regarding the phone service, dog food or toilet paper that you sell, you’re usually on the safe side. And the same largely holds true for companies. If wireless provider X is reliable and moderately priced, will I actively seek out X CEO’s blog to add my praise? Not too likely. But once things go wrong - once I’m frustrated and annoyed and quite sure that nobody is doing anything at all about my problem - then I’m going to post a comment on the company blog and make sure that I’m heard.
Silence leaves a barn door open for interpretation. Explaining and apologizing are basic social abilities - a lack of them indicates that you don’t understand how interpersonal interaction works, or (even worse), that you understand quite well but don’t care.
Mr. Dunstone didn’t realize that he was saying a whole lot by not saying anything. Don’t make that mistake.
Oct 7th, 2006 | Corporate Blogging, Robert Scoble, Thesis | 7 Comments
As promised earlier, today I’m going to look at how Robert Scoble’s blog differs from other corporate blogs, and from blogs in general (apologies for the delay, this should have been up two days ago).
The earlier entry focused on a number of language-related statistics: word length, sentence length, words per post etc. In this second step, I want to look at the distribution of individual words in the three different collections analyzed and draw some (lofty) conclusions based on the results.
Here are the top ten most frequent words for Scobleizer, the corporate blogs collection and the random blog comparison group:
Scobleizer
Rank Word Frequency
1 THE 625
2 TO 442
3 A 431
4 I 414
5 AND 332
6 OF 313
7 IS 255
8 THAT 243
9 ON 221
10 IN 175
Corporate Blogs
Rank Word Frequency
1 THE 35432
2 TO 19714
3 AND 17692
4 A 16457
5 OF 16154
6 IN 11110
7 IS 8475
8 THAT 7819
9 I 7342
10 FOR 7220
Random Blogs Comparison Group
Rank Word Frequency
1 THE 4374
2 TO 2985
3 AND 2975
4 I 2951
5 OF 2097
6 A 2025
7 IN 1335
8 YOU 1146
9 THAT 1120
10 MY 1065
At first glance, you’re likely to think that the three lists look very alike. This is not unusual in any way - in virtually any given English text “THE” will rank at number 1, whether you are looking at the Bible or Personal Finance for Dummies. The same is the case with common function words such as prepositions, which form the basic building blocks of pretty much any text you can come across.
An interesting variation that I want to focus on for the moment is the distribution of the personal pronoun “I” and the possessive determiner “MY”. Both for Scobleizer and the Random Blogs Comparison Group “I” ranks at number 4, well ahead of any other pronouns (for example “WE”). In the Corporate Blogs Collection “I” is at rank number 9, making it significantly less frequent. Further down the list, “MY” ranks at 14 in Scobleizer and at 28 in the Corporate Blogs Collection. Conversely, “WE” ranks higher in Corp. Blogs than it does in the other two collections.
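Ranks hide the fact that the three collections differ hugely in size, so it can help to normalize the raw counts. The sketch below converts the counts of “I” from the tables above into occurrences per 1,000 words. It assumes that the token totals reported in the earlier statistics post (17,014, 667,969 and 105,253) still applied when the frequency lists were produced, which may not be exactly true; the helper per_thousand is a made-up name.

```python
# Raw counts of "I" from the top-ten tables, paired with token totals.
# ASSUMPTION: the totals come from the earlier statistics post and may
# have grown slightly by the time the frequency lists were generated.
collections = {
    "Scobleizer": (414, 17014),
    "Corporate Blogs": (7342, 667969),
    "Random Blogs Comparison Group": (2951, 105253),
}

def per_thousand(count, total_tokens):
    """Occurrences of a word per 1,000 running words."""
    return 1000 * count / total_tokens

for name, (count, total) in collections.items():
    print(f"{name}: {per_thousand(count, total):.1f} per 1,000 words")
```

Under that assumption, the normalized figures allow a direct comparison between collections regardless of how many posts each one contains.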
Big surprise there, you might think. Obviously Scoble speaks only for himself, thus he is unlikely to use “WE” as frequently as it is used in blogs on corporate responsibility or policy, most of which are authored by a team of people. Even in those cases where there is just one author, he or she often prefers the corporate “WE”, especially when the person in question is an executive. And of course there’s the possibility of largely writing without a personal agent.
What is intriguing to me, however, is just how close Scoble is to the Random Blogs Group with regard to “I”-use. The Random Blogs Group largely consists of blogs written by teenagers, housewives, activists and other private individuals. As with their writing, the question of personal involvement is always relevant in Scoble’s blogging - it all relates to him as an individual in some way. I find it likely that this level of involvement in turn engages his readers more strongly than a less personal (that is, less “self-centric”) approach would. Telling others about yourself serves a social function; it allows them to empathize with you, to better understand your motives. “Talking about yourself” does not necessarily always mean relating thoughts or emotions, though.
Scoble very often describes where he is and what he is doing because this gives his readers a better understanding of who he is, which allows them to better judge whether they value his opinion on whatever gadget, trend or company he then proceeds to discuss. He makes a conscious effort to overcome the decisive asymmetries in the relationship with his readers: the fact that they aren’t in the same place at the same time as he is. When you’re having a chat with your friend, all or some of the following apply:
- you are physically in the same place, at the same time
- you can hear the other person’s voice
- you can see the other person
- the other person is actively addressing you
- you can immediately respond to what he or she is saying
In a real-life, face-to-face conversation all of these points usually apply. In a technically mediated interaction, whether it’s texting on AIM or talking on the phone, normally some (but not all) criteria are applicable. The more of them apply, the more closely the interaction resembles a “real” conversation, simply because a real-life conversation has all of these characteristics. Notice how blogs are different. Only the last point works - you can respond to a blog, but not quite immediately. A blog author is very unlikely to exclusively address just one other person; the readership is usually plural and largely unknown.
So what does Scoble do to overcome these limitations? He tells you where he is and what he’s doing to make the kind of communication between him and his readers seem more like a conversation. Of course you could argue that it really is a conversation since you can respond to him - and you’d be right - but he aims to overcome the other impairments as well. The innovation here is that Scoble doesn’t pretend to address his readers directly (unless he really is responding to another blogger) since he doesn’t really know who they are individually. Instead he focuses on his part of the equation by making sure that you know where he’s coming from and where he’s going with something – both physically and metaphorically speaking.
While the figures cited above are pretty vague indicators which should not be over-interpreted, I think they support the basic idea that blogs can function as time-delayed conversations and are naturally used in that manner by individuals. When organizations blog they are confronted with their inherent inability to have conversations in the same way that individuals do. The options are thus to either let individuals speak for the company – which is risky for a plethora of reasons – or to (mis)use blogs as a broadcast medium. I’m not even saying that the latter can’t work, just that people are likely to be very critical of such a usage, because they expect blogs to work differently.
One thing to always keep in mind is that you’re not real to your readers unless you have a face, name, identity and physical location. We like to think that we can relate to abstractions just as easily as we relate to concrete things, but our instincts often say otherwise.
Oct 2nd, 2006 | Corporate Blogging, Robert Scoble, Thesis | 1 Comment
Disclaimer: No bloggers were harmed in the course of this experiment.
As I’ve hinted at in the past, I’m in the process of building a textual database that contains thousands of posts culled from the RSS feeds of about a hundred corporate blogs, plus a comparison group of several “miscellaneous” blogs randomly picked through blogger.com and blog.com. The corpus currently has a little under 800,000 words and is expected to reach a round million words (or tokens) in about two to three weeks’ time.
So far, I’m just calculating a few very basic statistics: post word count, post sentence count and average word/sentence/post length, along with a top 100 list of the most frequent words. Though these are very basic figures, they nevertheless give a few interesting clues about the sources in question, especially when you compare one collection of blogs with another.
My test subject today will be Robert Scoble’s blog, Scobleizer. I’ll compare it to a) a large collection of other company blogs and b) a collection of randomly chosen non-corporate blogs. My reasons for picking Robert are pretty unspectacular. I happened to add him to the database fairly early on so that now I have a reasonable amount of data. Also, his immense popularity should make for some interesting results… note that I say “interesting” and not conclusive – a few language statistics don’t equate to the recipe for the Scoble Special Sauce of Blogging Fame. Anyway, let’s crunch a few numbers.
Scobleizer
Posts: 327
First Post to Last Post (FPLP): 2 August 2006, 03:26 - 30 September 2006, 22:07
Tokens / Types (Ratio): 17014 / 3743 (4.55)
Sentences (SC): 1950
Average Word Length (AWL): 4.9
Average Sentence Length (ASL): 10.1
Average Words per Post (AWpP): 52.9*
* not relevant because Scoble’s RSS doesn’t include complete posts but only summaries (the first 56 words)
Corporate Blogs
(Blogs: 107)
Posts: 4443
First Post to Last Post (FPLP): 2 May 2005, 00:00 - 2 October 2006, 00:50
Tokens / Types (Ratio): 667969 / 62230 (10.73)
Sentences (SC): 44350
Average Word Length (AWL): 5.5
Average Sentence Length (ASL): 15.9
Average Words per Post (AWpP): 155.1
Random Blogs Comparison Group
(Blogs: 18)
Posts: 576
First Post to Last Post (FPLP): 17 November 2004, 03:17 - 2 October 2006, 00:48
Tokens / Types (Ratio): 105253 / 16979 (6.2)
Sentences (SC): 10335
Average Word Length (AWL): 5.1
Average Sentence Length (ASL): 10.8
Average Words per Post (AWpP): 184.5
The stats
The first thing to note is that the three collections differ significantly in terms of size. The Scobleizer collection only has a size of 17,014 tokens (words), while both the corporate blog collection (667,969 tokens) and the random blogs comparison group (105,253 tokens) are much larger. This affects how reliable the figures are: statistics drawn from a larger sample are generally more stable, so the Scobleizer numbers in particular should be read with caution. The posts indexed in my database are not all of the posts made on those blogs, but only those which have been recorded since I began indexing a few months ago. Some entries date back several years, which is simply due to the fact that some of the RSS feeds which were used go back that far.
You might be wondering what on earth types are. Don’t worry, it’s really simple: while tokens are all words in a text, types are all unique words. So while the sentence “The cat ate the mouse” has 5 tokens, it only has 4 types because “the” occurs twice. The token-type-ratio for that sentence would be 5:4, or 1.25. As you can imagine, a long text will have a significantly larger number of tokens than types, since function words (pronouns, articles, prepositions etc) are re-used all the time, while lexical words (something like “blog”, “Google” or “greenish”) occur a lot less often.
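If you want to try this on your own texts, the following is a minimal sketch in Python. The function name token_type_ratio and the letters-only tokenizer are assumptions for illustration, not the tool used to build the database.

```python
import re

def token_type_ratio(text):
    """Return (tokens, types, ratio) for a piece of text."""
    # Lower-case first so that "The" and "the" count as the same type.
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    tokens = len(words)        # every running word
    types = len(set(words))    # every distinct word
    ratio = tokens / types if types else 0.0
    return tokens, types, ratio

print(token_type_ratio("The cat ate the mouse"))  # (5, 4, 1.25)
```

The example sentence gives 5 tokens and 4 types, matching the ratio of 1.25 worked out above.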
The other statistics are pretty straightforward: the number of total posts in the database, the time span from the first to the last post, the total number of sentences and three averages: average word length (AWL), average sentence length (ASL) and average words per post (AWpP). AWL refers to the number of characters in a word, while ASL in turn refers to the number of words in a sentence. As mentioned above, Scoble’s AWpP value should be ignored, since his RSS feed does not include complete entries but only summaries.
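Here is a rough sketch of how the three averages can be computed from a list of posts. It is only an approximation of the idea: the function name text_stats is made up, and splitting sentences on “.”, “!” and “?” is a naive assumption that will stumble over abbreviations and URLs, so the results will not match the figures above exactly.

```python
import re

def text_stats(posts):
    """Compute AWL, ASL and AWpP for a list of post texts."""
    words = []
    sentence_count = 0
    for post in posts:
        # Naive sentence split on ., ! and ? (ASSUMPTION: good enough here).
        sentences = [s for s in re.split(r"[.!?]+", post) if s.strip()]
        sentence_count += len(sentences)
        words.extend(re.findall(r"[A-Za-z]+(?:['’][A-Za-z]+)*", post))
    total_chars = sum(len(w) for w in words)
    return {
        "AWL": total_chars / len(words) if words else 0.0,          # characters per word
        "ASL": len(words) / sentence_count if sentence_count else 0.0,  # words per sentence
        "AWpP": len(words) / len(posts) if posts else 0.0,          # words per post
    }

example_posts = ["The cat ate the mouse. It was tasty.", "Dogs bark."]
print(text_stats(example_posts))
```

For the two example posts this counts 10 words in 3 sentences, so the averages describe a very terse blogger indeed.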
A cautious interpretation
The comparison shows that Robert Scoble uses shorter words and sentences than both the blogs in the random comparison group and those in the corporate blogging collection. Words are only slightly shorter (Scoble: 4.9; Corp. blogs: 5.5; Random blogs: 5.1), but variation in this category is normally not very strong, so the difference between Scoble and the corporate blogs seems notable. The differences in sentence length (10.1; 15.9; 10.8) are even more pronounced: on average, the other corporate blogs have much longer sentences than Scoble, who is again a little below the average value of the random blogs. Finally, it cannot be determined if Scoble’s posts are shorter than those in the other two collections (52.9*; 155.1; 184.5) because his RSS syndicates only summaries, though my personal bet would be that they are. This is also the only category where the random group scores higher than the corporates.
So what does this mean? In one sentence, it means that on average Robert Scoble seems to use shorter sentences than most other corporate bloggers, and that the words he uses are also noticeably shorter. Looking further, it appears that Scoble’s style – only speaking in terms of word and sentence length – is closer to that of non-corporate bloggers. However, these numerical statistics aren’t terribly exciting by themselves, which is why tomorrow I’ll take a peek at a list of the most frequently used words in our three source collections.
(to be continued)
Edit: My claim that Robert’s RSS feed does not contain full texts is bogus - my indexing tool was simply looking in the wrong place. I’ll correct the problem asap. Mea culpa.